A Statistical Method for English to Arabic Machine Translation
Author: | Marwan Akeel, Ravi Bhushan Mishra |
---|---|
Publication year: | 2014 |
Subject: |
Arabic machine translation
Morphology (linguistics); Phrase; Machine translation; Computer science; Arabic; Speech recognition; Information Systems: Information Storage and Retrieval; Machine translation software usability; Inflection; Computing Methodologies: Document and Text Processing; Determiner; Artificial intelligence; Word (computer architecture); Natural language processing |
Source: | International Journal of Computer Applications. 86:13-19 |
ISSN: | 0975-8887 |
DOI: | 10.5120/14957-3124 |
Description: | Translating from English into a morphologically richer language such as Arabic is a challenge for statistical machine translation. Segmentation of Arabic text was introduced to bridge the inflectional morphology gap. In this work, we investigate the impact of supporting a morphologically segmented Arabic training corpus with a one-to-one dictionary in a phrase-based statistical machine translation system, and we examine the effects on system performance. The results show that the dictionary improves the quality of the translation output, especially when the corpus is normalized and fully segmented excluding the determiner. The dictionary also decreases the out-of-vocabulary rate. The effect of dictionary support with different baseline and factored models, using data ranging from full word forms to fully segmented forms, is also demonstrated. |
Database: | OpenAIRE |
External link: |