Visual perception is one of the core building blocks of general machine intelligence. Deep learning methods have accelerated progress in this field and advanced the performance of many visual recognition systems. However, a clear gap remains between humans and machine models, both in the understanding of visual data and in the complexity of tasks that can be performed. One important observation is that humans excel at generalizing from finite data through impressive reasoning, rather than simply memorizing information. This idea applies across domains and motivates several important modeling properties, such as compositional modeling, symbolic methods, and structured models. In this thesis, we contribute to two broad classes of tasks on visual data, recognition and generation, and demonstrate progress in both modeling and application based on these concepts. Specifically, we first propose unified graph-based models that integrate graph structures into deep neural networks, and further develop a meta-reasoning model for more flexible inference. Second, we bring symbolic methods into generative modeling and show how their concise and powerful representations can aid complex scene generation. Finally, we propose a model that addresses highly complex graph distributions using a flow-based method.
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Thesis advisor: Greg Mori