Bootstrapping via graph propagation

Date created: 
2012-08-23
Identifier: 
etd7421
Keywords: 
Natural language processing, computational linguistics, machine learning, unsupervised learning, semi-supervised learning, bootstrapping, graph propagation
Abstract: 

The Yarowsky algorithm is a simple self-training algorithm for bootstrapping learning from a small number of initial seed rules, and it has proven very effective in several natural language processing tasks. Bootstrapping a classifier from a small set of seed rules can be viewed as the propagation of labels between examples via the features they share. This thesis introduces a novel variant of the Yarowsky algorithm based on this view: a bootstrapping learning method that uses a graph propagation algorithm with a well-defined objective function. Experimental results show that the proposed bootstrapping algorithm matches or exceeds state-of-the-art performance on several different natural language data sets.

Document type: 
Thesis
Rights: 
Copyright remains with the author. The author granted permission for the file to be printed and for the text to be copied and pasted.
Senior supervisor: 
Anoop Sarkar
Department: 
Applied Science: School of Computing Science
Thesis type: 
(Thesis) M.Sc.