Discriminative key-segment model for interaction detection

Resource type
Thesis type
(Thesis) M.Sc.
Date created
2014-03-11
Authors/Contributors
Abstract
Automatic activity detection in videos has several applications in visual surveillance, video retrieval, and human-computer interaction. At its core, the task requires expressive models of activities. Models that represent activities as arrangements of key components are generally more descriptive and more robust to challenges such as occlusion, clutter, and high intra-class variability, and can thus lead to improved classification performance. Following this idea, we model human-object interactions as sequences of locally discriminative temporal segments that capture the appearance of objects and their interrelations. In a two-stage pipeline, we first coarsely localize humans and objects in long videos, and then examine their content more closely using our key-segment model, trained in the latent SVM framework. We evaluate our approach on the VIRAT Ground Dataset Release 2.0 for detecting instances of human-vehicle interactions. Results show that our key-segment model significantly outperforms the common global Bag of Words approach.
Document
Identifier
etd8290
Copyright statement
Copyright is held by the author.
Permissions
The author has not granted permission for the file to be printed nor for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact summit-permissions@sfu.ca.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Mori, Greg
Member of collection
Download file
etd8290_YSefidgar.pdf (23.6 MB)
