The TALP-UPC participation in WMT21 news translation task: an mBART-based NMT approach

Authors: Escolano Peinado, Carlos (ORCID 0000-0001-6657-673X); Tsiamas, Ioannis (ORCID 0000-0003-1049-2515); Basta, Christine Raouf Saad; Ferrando Monsonís, Javier (ORCID 0000-0002-2637-0961); Ruiz Costa-Jussà, Marta (ORCID 0000-0002-5703-520X); Rodríguez Fonollosa, José Adrián (ORCID 0000-0001-9513-7939)
Contributors: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació; Universitat Politècnica de Catalunya. Doctorat en Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. Doctorat en Intel·ligència Artificial; Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
Language: English
Publication year: 2021
Subject:
Source: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Description: This paper describes the UPC Machine Translation group's submission to the WMT 2021 news translation shared task. The task is to translate from German to French (De-Fr) and from French to German (Fr-De). Our submission focuses on fine-tuning a pre-trained model to take advantage of monolingual data. We fine-tune mBART50 on the filtered data and, additionally, train a Transformer model from scratch on the same data. In our experiments, fine-tuning mBART50 yields 31.69 BLEU for De-Fr and 23.63 BLEU for Fr-De, improvements of 2.71 and 1.90 BLEU, respectively, over the model trained from scratch. Our final submission is an ensemble of these two models, which further improves Fr-De by 0.3 BLEU.
Database: OpenAIRE