A Statistical Method for English to Arabic Machine Translation

Authors: Marwan Akeel, Ravi Bhushan Mishra
Year of publication: 2014
Subject:
Source: International Journal of Computer Applications. 86:13-19
ISSN: 0975-8887
DOI: 10.5120/14957-3124
Description: Translating from English into a morphologically richer language like Arabic is a challenge in statistical machine translation. Segmentation of Arabic text was introduced to bridge the inflectional morphology gap. In this work, we investigate the impact of supporting a morphologically segmented Arabic training corpus with a one-to-one dictionary in a phrase-based statistical machine translation system, and we examine the effects on system performance. The results show that the dictionary improves the quality of the translation output, especially when the corpus is normalized and fully segmented, excluding the determiner. The dictionary also decreases the out-of-vocabulary rate. The effect of dictionary support on different baseline and factored models, using data ranging from full word forms to fully segmented forms, is also demonstrated.
Database: OpenAIRE