Multi-person video understanding with deep neural networks

Date created: 
2020-08-25
Identifier: 
etd21063
Keywords: 
Group activity recognition
Video Understanding
Network Compression
Dense Video Captioning
Abstract: 

In this thesis, we present new methods for multi-person scene understanding, which we analyze from the perspective of group activity recognition. We identify key challenges in group activity recognition and present deep neural network-based approaches to handle them. We show that the proposed approaches achieve competitive performance on group activity recognition. We also study one key component of group activity recognition, the problem of sequence modeling, in more detail by applying new sequence modeling methods to the task of dense video captioning. Finally, we investigate how to compress these large deep neural networks for efficient recognition on specialized domain tasks.

Document type: 
Thesis
Rights: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Supervisor(s): 
Greg Mori
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.