Supertagging is a sequence prediction task in which each word is assigned a complex syntactic structure called a supertag. In this thesis, we propose a novel multi-task learning approach for Tree Adjoining Grammar (TAG) supertagging that deconstructs these complex supertags into a set of related auxiliary sequence prediction tasks that best represent the structural information of each supertag. Our multi-task prediction framework is trained on the same training data used to train the original supertagger, with each auxiliary task providing an alternative view of the original prediction task. Our experimental results show that the multi-task approach significantly improves TAG supertagging, achieving a new state-of-the-art accuracy of 91.39% on the Penn Treebank supertagging dataset. We also show a consistent improvement of around 0.4% in tagging accuracy when applying our multi-task prediction framework to various neural supertagging models, without using any additional data resources.
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Thesis advisor: Sarkar, Anoop