Resource type
Thesis
Thesis type
(Thesis) M.Sc.
Date created
2014-03-11
Authors/Contributors
Author: Sefidgar, Yasaman Sadat
Abstract
Automatic activity detection in videos has several applications in visual surveillance, video retrieval, and human-computer interaction. At its core, the task requires expressive models of activities. Models that represent activities as arrangements of key components are generally more descriptive and more robust to challenges such as occlusion, clutter, and high intra-class variability, and can thus lead to improved classification performance. Following this idea, we model human-object interactions as sequences of locally discriminative temporal segments that capture object appearance and inter-object relations. In a two-stage pipeline, we first coarsely localize humans and objects in long videos; we then examine their content more closely using our key-segment model, trained in the latent SVM framework. We evaluate our approach on the VIRAT Ground Dataset Release 2.0 for detecting instances of human-vehicle interactions. Results show that our key-segment model significantly outperforms the common global Bag-of-Words approach.
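The "key-segment model trained in the latent SVM framework" mentioned in the abstract can be read against the standard latent SVM formulation. The sketch below states that general formulation only; interpreting the latent variable as the assignment of discriminative temporal key segments is an assumption about this thesis, not its exact notation. Here x denotes a candidate human-object interaction clip, h ranges over latent key-segment assignments, Φ(x, h) is the joint feature map encoding object appearance and interrelations under that assignment, and w is the learned weight vector.

```latex
% Scoring: the best latent key-segment assignment h determines the score of clip x
f_w(x) \;=\; \max_{h \in \mathcal{H}(x)} \; w \cdot \Phi(x, h)

% Training: max-margin objective over labeled clips (x_i, y_i), with y_i \in \{-1, +1\}
\min_{w} \;\; \frac{1}{2}\,\lVert w \rVert^2 \;+\; C \sum_{i=1}^{n} \max\!\bigl(0,\; 1 - y_i\, f_w(x_i)\bigr)
```

In the usual latent SVM training scheme, this objective is minimized by alternating between imputing the best assignment h for the positive examples and solving the resulting convex problem in w; whether the thesis follows this exact coordinate-descent procedure is likewise an assumption here.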
Identifier
etd8290
Copyright statement
Copyright is held by the author.
Scholarly level
Graduate
Supervisor or Senior Supervisor
Thesis advisor: Mori, Greg
Member of collection
Download file | Size
---|---
etd8290_YSefidgar.pdf | 23.6 MB