Primitive fitting for unsupervised single-view 3D reconstruction

Resource type
Thesis type
(Thesis) M.Sc.
Date created
3D reconstruction technology can offer significant benefits for analyzing fruit quality in agriculture; however, the complex aggregate shape of fruit bundles such as grapes and the absence of accurate 3D ground truth data are key challenges. In this work, we address these challenges via a self-supervised, two-stage approach that takes a 2D image as input and outputs a 3D representation of the fruit bundle, composed of its constituent primitives and presented as a textured mesh, along with the camera pose that produces the closest projection. We begin with a training stage, after which we use the trained network to predict the deformed coarse shape, texture, and camera pose for an unseen image. We then move on to a primitive refinement stage, in which we convert the deformed shape into a primitive-based representation. During both the training and primitive refinement stages, our objective is to simultaneously minimize 2D losses, which ensure precise per-instance reconstruction, and adhere to global 3D priors, which increase the likelihood of generating a mesh that appears realistic from various views. In a quantitative evaluation on synthetic 3D grape models, our model outperforms baseline approaches on the 2D mask IoU and 3D Hausdorff distance metrics. Moreover, a qualitative evaluation on real fruit bundles shows that our model can generate 3D shapes with intricate primitive-level details that look highly realistic from different viewpoints.
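The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names, the nested-list mask format, and the choice to measure the Hausdorff distance between two finite 3D point sets (for example, mesh vertices or sampled surface points) are assumptions made for illustration only.

```python
import math


def mask_iou(pred, target):
    """Intersection over union of two same-sized binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest neighbour in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy inputs (hypothetical, not data from the thesis).
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [1, 0]]
iou = mask_iou(pred_mask, true_mask)          # 1 shared pixel out of 3 covered

pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
dist = hausdorff(pred_points, true_points)    # driven by the point at x = 3
```

A higher mask IoU means the projected silhouette overlaps the reference mask more closely, while a lower Hausdorff distance means the worst-case gap between the two 3D shapes is smaller. The brute-force nearest-neighbour search here is quadratic in the number of points and is only meant to make the definition concrete.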
32 pages.
Copyright statement
Copyright is held by the author(s).
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Chen, Mo
