Razavi, Marzieh

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2012-08-16

Authors/Contributors

Author: Razavi, Marzieh

Abstract

Syntactic parsing, and dependency parsing in particular, is a core component of many Natural Language Processing (NLP) tasks and applications. Improvements in dependency parsing can help improve machine translation and information extraction, among many other applications. In this thesis, we extend the dependency parsing framework of Koo, Carreras, and Collins (2008), which uses a single clustering method for semi-supervised learning. We use multiple diverse clustering methods to build multiple discriminative dependency parsing models in the Maximum Spanning Tree (MST) parsing framework (McDonald, Crammer, and Pereira, 2005). These diverse clustering-based parsers are then combined using a novel ensemble model, which performs exact inference on the shared hypothesis space of all the parser models. We show that the diverse clustering-based parser models and the ensemble method together significantly improve unlabeled dependency accuracy from 90.82% to 92.46% on Section 23 of the Penn Treebank. We also show significant improvements in domain adaptation to the Switchboard and Brown corpora.

Identifier

etd7405

Copyright statement

Copyright is held by the author.

Permissions

The author granted permission for the file to be printed and for the text to be copied and pasted.

Scholarly level

Graduate student (Masters)

Supervisor or Senior Supervisor

Thesis advisor: Sarkar, Anoop

Member of collection

Computing Science Theses

Download file	Size
etd7405_MRazavi.pdf	2.17 MB

Ensembles of diverse clustering-based discriminative dependency parsers
