Natural Language Generation is a subfield of Natural Language Processing concerned with automatically creating human-readable text from non-linguistic forms of information. A template-based approach to Natural Language Generation uses base formats for different types of sentences, which are then transformed to create the final readable output. In this thesis, we investigate the suitability of a template-based approach to multilingual Natural Language Generation of sports summaries. We implement a system that generates English and Bangla summaries, using a pipelined architecture to transform the data in multiple stages. Additionally, we demonstrate how the automatically generated summaries differ from human-generated summaries. We show that, by using a template-based approach, the system can generate acceptable output in multiple languages without requiring detailed grammatical knowledge, which is important for languages such as Bangla, where computational resources are still scarce.
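To make the idea of a template-based, pipelined generator concrete, the following is a minimal sketch and not the system implemented in the thesis. The `MatchResult` record, the two stage functions, and the English and Bangla template strings are all hypothetical and chosen only for illustration. It shows how one set of extracted facts can be passed through a content-selection stage and then realized in either language by filling slots in a per-language template, with no grammar rules beyond the template text itself.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """Hypothetical non-linguistic input: the final score of one match."""
    home: str
    away: str
    home_goals: int
    away_goals: int


# Hypothetical templates, one per language and sentence type.
# The Bangla strings are illustrative and are not taken from the thesis.
TEMPLATES = {
    "en": {
        "win": "{winner} beat {loser} {high}-{low}.",
        "draw": "{home} and {away} drew {high}-{low}.",
    },
    "bn": {
        "win": "{winner} {high}-{low} গোলে {loser}-কে হারিয়েছে।",
        "draw": "{home} ও {away} {high}-{low} গোলে ড্র করেছে।",
    },
}


def select_content(match: MatchResult) -> dict:
    """Stage 1: turn raw data into language-independent facts."""
    high = max(match.home_goals, match.away_goals)
    low = min(match.home_goals, match.away_goals)
    if match.home_goals == match.away_goals:
        return {"type": "draw", "home": match.home, "away": match.away,
                "high": high, "low": low}
    home_won = match.home_goals > match.away_goals
    return {
        "type": "win",
        "winner": match.home if home_won else match.away,
        "loser": match.away if home_won else match.home,
        "high": high,
        "low": low,
    }


def realize(facts: dict, lang: str) -> str:
    """Stage 2: pick the template for this sentence type and fill its slots."""
    template = TEMPLATES[lang][facts["type"]]
    # str.format ignores unused keyword arguments such as "type".
    return template.format(**facts)


def generate(match: MatchResult, lang: str) -> str:
    """Run the two-stage pipeline for one language."""
    return realize(select_content(match), lang)


if __name__ == "__main__":
    result = MatchResult("Tigers", "Lions", 3, 1)
    print(generate(result, "en"))
    print(generate(result, "bn"))
```

In this sketch, supporting another language only requires adding another entry to `TEMPLATES`; the content-selection stage is shared across languages.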
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Supervisor or Senior Supervisor
Thesis advisor: Popowich, Fred