Constructing a Turkish-English parallel treebank

Authors: Razieh Ehsani, Ercan Solak, Olcay Taner Yıldız, Onur Görgün
Contributors: Işık Üniversitesi, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü, Işık University, Faculty of Engineering, Department of Computer Engineering, Yıldız, Olcay Taner, Solak, Ercan, Görgün, Onur, Ehsani, Razieh
Source: Scopus-Elsevier
ACL (2)
Description: In this paper, we report our preliminary efforts in building an English-Turkish parallel treebank corpus for statistical machine translation. In the corpus, we manually generated parallel trees for about 5,000 sentences from the Penn Treebank. English sentences in our set have a maximum of 15 tokens, including punctuation. We constrained the translated trees to the reordering of the children and the replacement of the leaf nodes with appropriate glosses. We also report the tools that we built and used in our tree translation task. Publisher's Version
Database: OpenAIRE
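
Illustration: the abstract restricts tree translation to two operations, reordering the children of a node and replacing leaf tokens with glosses. The following is a minimal sketch of that constraint, not the authors' tool. The `Node` class, the `translate` and `leaves` functions, the path-keyed `order` mapping, and the example sentence with its glosses are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Path = Tuple[int, ...]  # child indices leading from the root to a node


@dataclass
class Node:
    """A constituency tree node (hypothetical structure for this sketch)."""
    label: str  # constituent or POS tag; for a leaf, the token itself
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def translate(node: Node,
              order: Dict[Path, Sequence[int]],
              gloss: Dict[str, str],
              path: Path = ()) -> Node:
    """Build a target tree using only child reordering and leaf replacement.

    `order` maps a source-tree path to a permutation of that node's children;
    nodes without an entry keep their original child order.
    `gloss` maps a source token to its target gloss (keyed by token string,
    which is a simplification: real glosses depend on context).
    """
    if node.is_leaf():
        return Node(gloss.get(node.label, node.label))
    n = len(node.children)
    perm = list(order.get(path, range(n)))
    # Reject anything that is not a pure permutation, so that no child
    # can be dropped or duplicated.
    if sorted(perm) != list(range(n)):
        raise ValueError(f"not a permutation of {n} children at path {path}")
    return Node(node.label,
                [translate(node.children[i], order, gloss, path + (i,))
                 for i in perm])


def leaves(node: Node) -> List[str]:
    """Return the tokens of a tree from left to right."""
    if node.is_leaf():
        return [node.label]
    return [tok for child in node.children for tok in leaves(child)]


# Toy example (illustrative glosses, not taken from the corpus):
# (S (NP (NNP Ali)) (VP (VBD read) (NP (NNS books))))
english = Node("S", [
    Node("NP", [Node("NNP", [Node("Ali")])]),
    Node("VP", [
        Node("VBD", [Node("read")]),
        Node("NP", [Node("NNS", [Node("books")])]),
    ]),
])

# Swap the verb and its object under VP (path (1,)) to get verb-final order.
turkish = translate(
    english,
    order={(1,): [1, 0]},
    gloss={"read": "okudu", "books": "kitap"},
)
print(leaves(turkish))  # ['Ali', 'kitap', 'okudu']
```

Because the target tree keeps every source node and only permutes siblings, each target constituent stays aligned with exactly one source constituent, which is the property that makes such a treebank parallel at the subtree level.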