Transforming neural machine translation into simultaneous text and speech translation

Thesis type
(Thesis) Ph.D.
Date created
2021-07-29
Authors/Contributors
Abstract
Simultaneous neural machine translation (SiMT) aims to maintain translation quality while minimizing the delay between reading the input and incrementally producing the output. The eventual goal of SiMT is to match the performance of highly skilled human interpreters, who can listen to a speaker in a source language and simultaneously produce a translation in the target language with minimal delay. In this thesis, we explore approaches to building reliable simultaneous translation systems that produce fluent translations with minimal latency. We present two distinct methods for finding an optimal policy that decides whether the current input is sufficient for generating an accurate translation or whether the system needs to wait for more information. Our first method employs a prediction mechanism to inform the model about the incoming input stream. We show that as sentence length grows, predicting future time steps becomes essential, because more complex reorderings occur more often in long sentence pairs. Our second method introduces a new algorithmic approach for finding an optimal policy, which serves as a reference in a supervised learning model. The resulting system translates more accurately with less delay. Our third project focuses on improving the performance of an end-to-end speech translation system, which many simultaneous speech systems rely on. We propose a new loss function that allows us to use the large datasets available for the machine translation task to improve the performance of the speech translation system.
Document
Identifier
etd21556
Copyright statement
Copyright is held by the author(s).
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Sarkar, Anoop
Language
English
Member of collection
Attachment Size
input_data\22258\etd21556.pdf 2.22 MB