Author: Huang, Zhi Feng
In this thesis, we present work towards addressing a grand challenge of computer vision: human action recognition and detection. In particular, we focus on the problem of recognizing and detecting the actions of a person from a video sequence. A typical approach to recognizing human actions in a video first detects and tracks people, then classifies their actions. However, accurate tracking is challenging, and state-of-the-art tracking methods are not reliable. Since accurate tracking is not a direct end goal of action recognition, we treat tracking as a latent variable and train a model focused on action recognition. We propose a novel learning algorithm for training models with latent variables in a boosting framework. Moreover, we show that the algorithm can be used to train an action recognition model in which the tracking trajectory of a person is a latent variable. This new model outperforms baselines on a variety of datasets.
Copyright is held by the author.
The author granted permission for the file to be printed, but not for the text to be copied and pasted.
Supervisor or Senior Supervisor
Thesis advisor: Mori, Greg