Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources
Author: | Adriano Ferraresi, Federico Gaspari, Silvia Bernardini, Randy Scansani, Marcello Soffritti |
---|---|
Contributors: | Randy Scansani, Silvia Bernardini, Adriano Ferraresi, Federico Gaspari, Marcello Soffritti |
Language: | English |
Year of publication: | 2017 |
Subject: | Phrase; Machine translation; Computer science; Computational linguistics; Terminology; German; Course (navigation); Quality (business); Baseline (configuration management); Artificial intelligence; Natural language processing; Computing Methodologies: Document and Text Processing |
Description: | This paper describes an approach to translating course unit descriptions from Italian and German into English using a phrase-based machine translation (MT) system. This genre is prominent among the text types that universities need to translate in European countries where English is a non-native language. For each language combination, an in-domain bilingual corpus of course unit and degree program descriptions is used to train an MT engine, whose output is then compared with that of a baseline engine trained on the Europarl corpus. In a subsequent experiment, a bilingual terminology database is added to the training sets of both engines, and its impact on output quality is evaluated using BLEU and post-editing scores. Results suggest that the use of domain-specific corpora boosts the engines' quality for both language combinations, especially for German-English, whereas adding terminological resources does not seem to bring notable benefits. |
Database: | OpenAIRE |
External link: |