English-Myanmar(Burmese) Phrase-Based SMT with One-to-One and One-to-Multiple Translations Corpora

Authors: Htun, Honey; Thu, Ye Kyaw; Nyein Nyein Oo; Thepchai Supnithi
Language: English
Year of publication: 2020
Subject:
DOI: 10.5281/zenodo.3745793
Description: This paper presents the first investigation of how machine translation (MT) performance between Myanmar and English varies when several possible Myanmar translations are used, in the specific domain of primary education. We developed both one-to-one and one-to-many Myanmar translation corpora (over 8K and 46K sentences, respectively) based on old and new English textbooks (including Grades 1 to 3) published by the Ministry of Education. The parallel corpora we developed were used for phrase-based statistical machine translation (PBSMT), the de facto standard approach in statistical machine translation. We measured the differences in machine translation performance among the one-to-many English-Myanmar translation corpora: BLEU scores range from 19.68 to 52.38 for English-to-Myanmar translation and from 50.17 to 75.12 for Myanmar-to-English translation. We expect that this study can be applied to Myanmar-to-English automatic speech recognition (ASR) development for primary English textbooks. The main purpose is to translate primary English textbook data correctly even when children use several Myanmar conversational styles.
Database: OpenAIRE