Translation versus language model pre-training objectives for word sense disambiguation

Resource type
Thesis
Thesis type
(Thesis) M.Sc.
Date created
2019-12-16
Abstract
Contextual word representations pre-trained on large text corpora have advanced the state of the art in many Natural Language Processing tasks. Most recent approaches pre-train such models using a language modeling (LM) objective. In this thesis, we compare and contrast such LM models with the encoder of an encoder-decoder model pre-trained using a machine translation (MT) objective. For certain tasks, such as word sense disambiguation, MT intuitively provides a better pre-training objective, since different senses of a word tend to translate differently into a target language, whereas an LM objective does not always require word senses to be distinguished. Our experimental results on word sense disambiguation provide insight into pre-training objective functions and can help guide future work on large-scale pre-trained models for transfer learning in NLP.
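A toy sketch (not from the thesis) of the intuition in the abstract: a translation target can differ by sense, so producing the correct target word implicitly requires disambiguation. The English-to-French pairs below are ordinary dictionary facts; the sense labels and the `mt_target` helper are hypothetical illustrations, not the thesis's method or data.

```python
# Illustrative only: why an MT objective can force sense distinctions
# that an LM objective need not make.

# English "bank" translates differently into French depending on its sense.
translations = {
    ("bank", "financial institution"): "banque",
    ("bank", "river edge"): "rive",
}

def mt_target(word, sense):
    """Return the French word an MT model must emit for this
    (word, sense) pair; the target depends on the source sense."""
    return translations[(word, sense)]

# A translation model cannot pick the right target without implicitly
# resolving the sense; a language model predicting the next English
# token can often leave the sense unresolved.
print(mt_target("bank", "financial institution"))  # -> banque
print(mt_target("bank", "river edge"))             # -> rive
```

In this toy setting, the MT training signal (the French word) differs across the two senses, while the English surface form does not, which is the abstract's argument for MT as a sense-sensitive pre-training objective.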
Identifier
etd20681
Copyright statement
Copyright is held by the author.
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Sarkar, Anoop
Attachment: etd20681.pdf (1 MB)