Haresh, Sanjay

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2023-04-03

Authors/Contributors

Author: Haresh, Sanjay

Abstract

Human-object interactions with articulated objects are common in everyday life. Despite much progress in single-view 3D reconstruction, it is still challenging to infer an articulated 3D object model from an RGB video showing a person manipulating the object. We canonicalize the task of articulated 3D human-object interaction reconstruction from RGB video and carry out a systematic benchmark of five families of methods for this task: 3D plane estimation, 3D cuboid estimation, CAD model fitting, implicit field fitting, and free-form mesh fitting. Our experiments show that all methods struggle to obtain high-accuracy results even when provided with ground-truth information about the observed objects. At the same time, we find that highly constrained object shape representations (e.g., CAD models) work much better than unconstrained representations (e.g., free-form meshes). We also identify key factors that make the task challenging and suggest directions for future work on this 3D computer vision problem.

Extent

35 pages.

Keywords

Identifier

etd22429

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Savva, Manolis

Language

English

Member of collection

Computing Science Theses

Download file	Size
etd22429.pdf	16.35 MB

Articulated 3D human-object interactions from RGB videos: An empirical analysis of approaches and challenges
