Zhou, Guang-Tong

Resource type

Thesis

Thesis type

(Thesis) Ph.D.

Date created

2015-11-17

Authors/Contributors

Author: Zhou, Guang-Tong

Abstract

Scene recognition is a fundamental and open problem in computer vision. It is an essential component of a variety of real-world applications, including image search, robotics, and social media analysis. The key to success in scene recognition is a good understanding of the rich semantics embedded in scenes. For example, it is intuitive to assign the label "airport" to a scene containing sky, an airplane, a road, and a building. In this thesis, we identify two directions for exploiting scene semantics. On one hand, we advocate the discovery of scene parts that correspond to various semantic components in scenes, such as objects and surfaces. On the other hand, we promote the discovery of scene structures that capture the spatial relations among scene parts, such as sky-above-airplane. By leveraging scene parts and structures in scene recognition, we are able to build strong recognition systems. Our contributions are two-fold. First, we propose two clustering algorithms for the data-driven discovery of semantics in visual data. Specifically, we develop latent maximum-margin clustering to model semantics as latent variables, and hierarchical maximum-margin clustering to discover tree-structured semantic hierarchies. Our second contribution is the development of two scene recognition methods that leverage scene structure discovery and part discovery. The first method recognizes scenes by treating a scene image as a structured collage of objects. The second method discovers scene parts that are both discriminative and representative for scene recognition.

Keywords

Identifier

etd9296

Copyright statement

Copyright is held by the author.

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Scholarly level

Graduate student (PhD)

Supervisor or Senior Supervisor

Thesis advisor: Mori, Greg

Member of collection

Computing Science Theses

Download file	Size
etd9296_GZhou.pdf	7.83 MB

Toward Scene Recognition by Discovering Semantic Structures and Parts
