Latent structure discriminative learning for natural language processing

Date created: 
Discriminative learning
Latent variable models
Structured learning
Statistical machine translation
Language modeling

Natural language is rich with layers of implicit structure, and previous research has shown that exploiting this structure yields more accurate models. Most attempts to use implicit natural language structure in natural language processing (NLP) tasks have assumed a pre-defined structural analysis that is fixed before the task-specific model is trained. However, rather than fixing the latent structure, we may wish to discover the latent structure that is most useful via feedback from an extrinsic task. The focus of this thesis is on jointly learning the best latent analysis along with the model for the NLP task of interest. In this work, we present a generalized learning framework for discriminative training over jointly learned latent structures, and apply it to several NLP tasks. We develop a high-accuracy discriminative language model over shallow parse structures. We demonstrate an efficient algorithm for learning this grammaticality classifier by combining the input of multiple representations of the latent structures. Next, we set forth a framework for latent structure learning for statistical machine translation (SMT), in which the latent segmentation and alignment of the parallel training data inform the translation model. This model jointly optimizes segmentation and alignment for the translation task, and is novel in learning over latent representations of the input. We also propose a discriminative bilingual topic model over hierarchically structured latent topics, which allows for weighted contributions from more informative inputs and can be optimized for SMT. We apply this model to morphological disambiguation and domain adaptation for SMT. Finally, we investigate large-scale distributed training for structured discriminative models and offer recommendations for distributed computational topologies.

Document type: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Senior supervisor: 
Anoop Sarkar
Applied Sciences: School of Computing Science
Thesis type: 
(Dissertation) Ph.D.