Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

Authors: Arivazhagan, Naveen; Bapna, Ankur; Firat, Orhan; Lepikhin, Dmitry; Johnson, Melvin; Krikun, Maxim; Chen, Mia Xu; Cao, Yuan; Foster, George; Cherry, Colin; Macherey, Wolfgang; Chen, Zhifeng; Wu, Yonghui
Year of publication: 2019
Subject:
Document type: Working Paper
Description: We introduce our efforts towards building a universal neural machine translation (NMT) system capable of translating between any language pair. We set a milestone towards this goal by building a single massively multilingual NMT model handling 103 languages trained on over 25 billion examples. Our system demonstrates effective transfer learning ability, significantly improving translation quality of low-resource languages, while keeping high-resource language translation quality on par with competitive bilingual baselines. We provide in-depth analysis of various aspects of model building that are crucial to achieving quality and practicality in universal NMT. While we prototype a high-quality universal translation system, our extensive empirical analysis exposes issues that need to be further addressed, and we suggest directions for future research.
Database: arXiv