Statistical Machine Translation between Related and Unrelated Languages

Authors: Kolovratník, D., Klyueva, N., Bojar, O.
Publication year: 2009
Source: Scopus-Elsevier
Description: This paper describes an attempt to assess how the relatedness of languages influences the performance of statistical machine translation (SMT). We apply the Moses toolkit to the Czech-English-Russian corpus UMC 0.1 to train two translation systems: Russian-Czech and English-Czech. Translation quality is evaluated on an independent test set of 1000 sentences, parallel in all three languages, using an automatic metric (BLEU score) as well as manual judgments. We examine whether Russian-Czech translation quality is higher than English-Czech quality thanks to the relatedness of the two languages and their similar word order and morphological richness. Additionally, we present and discuss the most frequent translation errors for both language pairs.
Database: OpenAIRE