Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling

Authors: Raphael Rubino, Tommi A. Pirinen, Miquel Esplà-Gomis, Prokopis Prokopidis, Vassilis Papavassiliou, Sergio Ortiz Rojas, Antonio Toral, Nikola Ljubešić
Year of publication: 2015
Subject:
Source: WMT@EMNLP
Description: This paper presents the machine translation systems submitted by the Abu-MaTran project for the Finnish-English language pair at the WMT 2015 translation task. We tackle the lack of resources and complex morphology of the Finnish language by (i) crawling parallel and monolingual data from the Web and (ii) applying rule-based and unsupervised methods for morphological segmentation. Several statistical machine translation approaches are evaluated and then combined to obtain our final submissions, which are the top-performing English-to-Finnish unconstrained (all automatic metrics) and constrained (BLEU), and Finnish-to-English constrained (TER) systems.
Database: OpenAIRE