Mostafavi Kashani, Mehdi

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2007

Authors/Contributors

Author: Mostafavi Kashani, Mehdi

Abstract

Transcribing named entities from one language into another is called transliteration. This thesis proposes a novel spelling-based method for the automatic transliteration of named entities from Arabic to English that exploits various types of letter-based alignments. The approach consists of three phases: the first phase uses single-letter alignments, the second phase uses alignments over groups of letters to deal with diacritics and missing vowels in the English output, and the third phase exploits various knowledge sources to repair any remaining errors. The results show a top-20 accuracy rate of up to 88%. The algorithm is also examined in the context of a machine translation task: we provide an in-depth analysis of the integration of our Arabic-to-English transliteration system into a general-purpose phrase-based statistical machine translation system. Our experiments show that a transliteration module helps significantly when the test data is rich in previously unseen named entities.
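For readers unfamiliar with the "top-20 accuracy" figure quoted in the abstract, the following is a minimal sketch of how a top-k accuracy is commonly computed for ranked transliteration candidates. It is not taken from the thesis: the function name `top_k_accuracy`, the case-insensitive exact-match criterion, and the sample names are all assumptions made for illustration, and the thesis may use a different matching rule.

```python
def top_k_accuracy(ranked_candidates, references, k=20):
    """Fraction of names whose reference spelling appears among the
    first k candidates proposed for that name.

    Assumption: a candidate counts as correct if it equals the
    reference ignoring case. The thesis may define a match differently.
    """
    if len(ranked_candidates) != len(references):
        raise ValueError("need exactly one candidate list per reference")
    if not references:
        return 0.0
    hits = 0
    for candidates, reference in zip(ranked_candidates, references):
        top_k = {c.lower() for c in candidates[:k]}
        if reference.lower() in top_k:
            hits += 1
    return hits / len(references)


# Hypothetical ranked outputs for three names (illustrative data only).
system_output = [
    ["mohamed", "muhammad", "mohammed"],
    ["kareem", "karim"],
    ["husain", "hosein"],
]
gold = ["Mohammed", "Karim", "Hussein"]

acc_top1 = top_k_accuracy(system_output, gold, k=1)
acc_top3 = top_k_accuracy(system_output, gold, k=3)
```

With this toy data, no reference is ranked first, while two of the three references appear within the first three candidates, so the score rises as k grows. This is why a top-20 figure is generally higher than a top-1 figure for the same system.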

Keywords

Copyright statement

Copyright is held by the author.

Permissions

The author has not granted permission for the file to be printed or for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact summit-permissions@sfu.ca.

Scholarly level

Graduate student (Masters)

Language

English

Member of collection

Computing Science Theses

Download file	Size
etd3203.pdf	771.98 KB

Automatic transliteration from Arabic to English and its impact on machine translation
