A tree-based approach for English-to-Turkish translation
Author: | Begüm Avar, Olcay Taner Yildiz, Özge Bakay |
---|---|
Contributors: | Işık Üniversitesi, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Işık University, Faculty of Engineering, Department of Computer Engineering; Yıldız, Olcay Taner |
Year of publication: | 2019 |
Subject: | Morphological analysis; Structural rules; Root (linguistics); Phrase; Hierarchical phrase-based; General Computer Science; Machine translation; Target words; Computer science; Turkish; Computational linguistics; Permutation; Engineering; Phrase-based models; Structural modifications; Electrical and Electronic Engineering; 10-fold cross-validation; Tree-based approach; Natural Language Processing; Tree; Translation; Tree (data structure); Replacement algorithm; Artificial intelligence; Computer aided language translation; Word (computer architecture) |
Source: | Turkish Journal of Electrical Engineering and Computer Science, Volume: 27, Issue: 1, pp. 437-452 |
ISSN: | 1303-6203, 1300-0632 |
Description: | In this paper, we present our English-to-Turkish translation methodology, which adopts a tree-based approach. Our approach relies on tree analysis and the application of structural modification rules to derive target-side (Turkish) trees from source-side (English) ones. We also use morphological analysis to obtain candidate root words and apply tree-based rules to produce the agglutinated target words. Compared with earlier work on English-to-Turkish translation using phrase-based models, the current study obtains higher BLEU scores. Our syntactic subtree permutation strategy, combined with a word replacement algorithm, yields a 67% relative improvement over the baseline, from 12.8 to 21.4 BLEU, with all scores averaged over 10-fold cross-validation. As future work, improvements in choosing the correct senses and structural rules are needed. This work was supported by TUBITAK project 116E104. Publisher's version. |
Database: | OpenAIRE |
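The description mentions subtree permutation followed by word replacement, and a 67% relative BLEU improvement. The following is a minimal Python sketch of those two ideas under stated assumptions, not the authors' implementation: the `Node` class, the single `PERMUTATION_RULES` entry, the toy `LEXICON`, and every function name are hypothetical and chosen only for illustration. The lexicon maps whole English words directly to inflected Turkish forms, whereas the record describes morphological analysis and tree-based rules for building the agglutinated words.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A parse-tree node: a label plus either children or a leaf word."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: str = ""

    def leaves(self) -> List[str]:
        """Return the leaf words in left-to-right order."""
        if not self.children:
            return [self.word]
        return [w for child in self.children for w in child.leaves()]


# Hypothetical rule table: (parent label, child labels) -> new child order.
# The single rule moves the verb after its object (English VO -> Turkish OV).
PERMUTATION_RULES: Dict[Tuple[str, Tuple[str, ...]], Tuple[int, ...]] = {
    ("VP", ("VBD", "NP")): (1, 0),
}

# Hypothetical toy lexicon (assumption: one fixed target form per word).
LEXICON: Dict[str, str] = {"read": "okudu", "books": "kitap"}


def permute(node: Node) -> Node:
    """Reorder children bottom-up wherever a permutation rule matches."""
    children = [permute(child) for child in node.children]
    key = (node.label, tuple(child.label for child in children))
    order = PERMUTATION_RULES.get(key)
    if order is not None:
        children = [children[i] for i in order]
    return Node(node.label, children, node.word)


def replace_words(node: Node, lexicon: Dict[str, str]) -> Node:
    """Swap each leaf word for its lexicon entry; unknown words are kept."""
    if not node.children:
        return Node(node.label, [], lexicon.get(node.word.lower(), node.word))
    return Node(node.label, [replace_words(c, lexicon) for c in node.children])


def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Source-side tree for "Ali read books".
tree = Node("S", [
    Node("NP", [Node("NNP", word="Ali")]),
    Node("VP", [
        Node("VBD", word="read"),
        Node("NP", [Node("NNS", word="books")]),
    ]),
])

translated = replace_words(permute(tree), LEXICON).leaves()
print(" ".join(translated))  # Ali kitap okudu
print(round(100 * relative_improvement(12.8, 21.4)))  # 67
```

The last line reproduces the arithmetic behind the reported figure: (21.4 - 12.8) / 12.8 ≈ 0.67.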