Yang, Weilong

Resource type

Thesis

Thesis type

(Thesis) Ph.D.

Date created

2012-09-26

Authors/Contributors

Author: Yang, Weilong

Abstract

Visual Recognition is a central problem in computer vision, and it has numerous potential applications in many dierent elds, such as robotics, human computer interaction, and entertainment. In this dissertation, we propose two discriminative latent variable models for handling challenging visual recognition problems. In particular, we use latent variables to capture and model various prior knowledge in the training data. In the rst model, we address the problem of recognizing human actions from still images. We jointly consider both poses and actions in a unied framework, and treat human poses as latent variables. The learning of this model follows the framework of latent SVM. Secondly, we propose another latent variable model to address the problem of automated tag learning on YouTube videos. In particular, we address the semantic variations (sub-tags) of the videos which have the same tag. In the model, each video is assumed to be associated with a sub-tag label, and we treat this sub-tag label as latent information. This model is trained using a latent learning framework based on LogitBoost, which jointly considers both the latent sub-tag label and the tag label. Moreover, we propose a novel discriminative latent learning framework, kernel latent SVM, which combines the benet of latent SVM and kernel methods. The framework of kernel latent SVM is general enough to be applied in many applications of visual recognition. It is also able to handle complex latent variables with interdependent structures using composite kernels.

Keywords

Identifier

etd7478

Copyright statement

Copyright is held by the author.

Permissions

The author granted permission for the file to be printed and for the text to be copied and pasted.

Scholarly level

Graduate student (PhD)

Supervisor or Senior Supervisor

Thesis advisor: Mori, Greg

Member of collection

Computing Science Theses

Download file	Size
etd7478_WYang.pdf	18.58 MB

Discriminative latent variable models for visual recognition

Keywords

Views & downloads - as of June 2023