Automatic transliteration from Arabic to English and its impact on machine translation

Resource type
Thesis type
(Thesis) M.Sc.
Date created
Transliteration is the transcription of named entities from one language into another. This thesis proposes a novel spelling-based method for automatically transliterating named entities from Arabic to English that exploits several types of letter-based alignment. The approach has three phases: the first uses single-letter alignments; the second uses alignments over groups of letters to handle diacritics and missing vowels in the English output; and the third draws on various knowledge sources to repair any remaining errors. The method achieves a top-20 accuracy of up to 88%. We also examine the algorithm in the context of a machine translation task, providing an in-depth analysis of the integration of our Arabic-to-English transliteration system into a general-purpose phrase-based statistical machine translation system. Our experiments show that a transliteration module helps significantly when the test data is rich in previously unseen named entities.
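To make the evaluation figure concrete: top-k accuracy is conventionally the fraction of test names whose reference transliteration appears among the system's k highest-ranked candidates. The Python sketch below illustrates that metric under this assumption. The function name, the toy candidate lists, and the case-insensitive exact-match comparison are illustrative choices, not details taken from the thesis.

```python
def top_k_accuracy(candidates, references, k=20):
    """Fraction of items whose reference appears among the first k candidates.

    candidates: list of ranked candidate lists, best candidate first.
    references: list of gold transliterations, one per item.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        1
        for ranked, ref in zip(candidates, references)
        # Assumption: a hit is a case-insensitive exact string match.
        if ref.lower() in (c.lower() for c in ranked[:k])
    )
    return hits / len(references)


# Hypothetical ranked outputs for two names (toy data, not from the thesis).
ranked_outputs = [
    ["Mohamed", "Muhammad", "Mohammed"],
    ["Kaled", "Khalid"],
]
gold = ["Muhammad", "Khaled"]

print(top_k_accuracy(ranked_outputs, gold, k=20))
print(top_k_accuracy(ranked_outputs, gold, k=1))
```

Because a name often has several acceptable English spellings, a score of this kind depends heavily on which reference spellings are counted as correct; the sketch assumes a single reference per name.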
Copyright statement
Copyright is held by the author.
The author has not granted permission for the file to be printed nor for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact
Scholarly level
Member of collection
Download file: etd3203.pdf
Size: 771.98 KB
