Resource type
Thesis type
(Thesis) M.Sc.
Date created
2023-03-31
Authors/Contributors
Author: Dadjooitavakoli, Dorsa
Abstract
3D reconstruction can offer significant benefits for analyzing fruit quality in agriculture; however, the complex aggregate shape of fruit bundles such as grapes and the absence of accurate 3D ground-truth data are key challenges. In this work, we address these challenges with a self-supervised, two-stage approach that takes a 2D image as input and outputs a 3D representation of the fruit bundle, composed of its constituent primitives and presented as a textured mesh, along with the camera pose that produces the closest projection. We begin with a training stage, after which we use the trained network to predict the deformed coarse shape, texture, and camera pose for an unseen image. A primitive refinement stage then converts the deformed shape into a primitive-based representation. In both stages, we minimize 2D losses, which ensure precise per-instance reconstruction, while adhering to global 3D priors, which increase the likelihood that the generated mesh appears realistic from various views. In a quantitative evaluation on synthetic 3D grape models, our model outperforms baseline approaches on the 2D mask IoU and 3D Hausdorff distance metrics. Moreover, a qualitative evaluation on real fruit bundles shows that our model can generate 3D shapes with intricate primitive-level details that look realistic from different viewpoints.
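For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how 2D mask IoU and a symmetric 3D Hausdorff distance are commonly computed. It is not taken from the thesis: the function names `mask_iou` and `hausdorff_distance` are hypothetical, and it assumes binary masks of equal shape and shapes represented as sampled 3D point sets. The thesis may sample, align, or normalize its meshes differently.

```python
import numpy as np


def mask_iou(pred, target):
    """Intersection over union of two binary masks with the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as a perfect match is an
        # assumed convention, not something stated in the abstract.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between point sets of shape (N, 3) and (M, 3)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M). This brute-force version
    # uses O(N * M) memory, so it only suits modest point counts.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Each directed distance is the largest nearest-neighbour distance;
    # the symmetric distance is the larger of the two directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

A higher IoU indicates better agreement between the projected and reference silhouettes, while a lower Hausdorff distance indicates that the worst-case deviation between the two 3D shapes is smaller.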
Document
Extent
32 pages.
Identifier
etd22372
Copyright statement
Copyright is held by the author(s).
Supervisor or Senior Supervisor
Thesis advisor: Chen, Mo
Language
English
Member of collection
Download file | Size |
---|---|
etd22372.pdf | 16.78 MB |