Toward Building a Comprehensive Phrase-based English-Arabic Statistical Machine Translation System
Authors: | Sara Ebrahim, Doaa Hegazy, Mostafa G. M. Mostafa, Samha R. El-Beltagy |
---|---|
Year of publication: | 2017 |
Subject: |
Arabic machine translation
Phrase; Machine translation; Computer science; Arabic; Scale (chemistry); Arabic natural language processing; Pipeline (software); Machine translation system; Artificial intelligence; Natural language processing |
Source: | The Egyptian Journal of Language Engineering. 4:10-26 |
ISSN: | 2356-8216 |
Description: | This paper explores a phrase-based statistical machine translation (PBSMT) pipeline for the English-Arabic (En-Ar) language pair. The work surveys the most recent experiments conducted to enhance Arabic machine translation in the En-Ar direction. It focuses on free datasets and linguistically motivated ideas that enhance phrase-based En-Ar statistical machine translation (SMT), since it aims to use only those resources to build a large-scale En-Ar SMT system. In addition, the paper highlights the challenges that Arabic poses for machine translation (MT) in general. The paper can be considered a guide for building an En-Ar PBSMT system. Furthermore, the presented pipeline can be generalized to any language pair. |
Database: | OpenAIRE |