Resource type
Thesis type
(Thesis) Ph.D.
Date created
2015-11-17
Authors/Contributors
Author: Zhou, Guang-Tong
Abstract
Scene recognition is a fundamental and open problem in computer vision. It is an essential component of a variety of real-world applications, including image search, robotics, social media analysis, and many others. The key to success in scene recognition is a thorough understanding of the rich semantics embedded in scenes. For example, it is intuitive to label a scene containing sky, airplane, road, and building as an airport. In this thesis, we identify two directions for exploiting scene semantics. On the one hand, we advocate for the discovery of scene parts that correspond to various semantic components in scenes, such as objects and surfaces. On the other hand, we promote the discovery of scene structures that capture the spatial relations among scene parts, such as sky-above-airplane. By leveraging scene parts and structures in scene recognition, we are able to build strong recognition systems. Our contributions are two-fold. First, we propose two clustering algorithms for the data-driven discovery of semantics in visual data. Specifically, we develop latent maximum-margin clustering to model semantics as latent variables, and hierarchical maximum-margin clustering to discover tree-structured semantic hierarchies. Our second contribution is the development of two scene recognition methods that leverage scene structure discovery and part discovery. The first method recognizes scenes by treating a scene image as a structured collage of objects. The second method discovers scene parts that are both discriminative and representative for scene recognition.
Document
Identifier
etd9296
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Mori, Greg
Member of collection
Download file | Size |
---|---|
etd9296_GZhou.pdf | 7.83 MB |