A Comparative Study of LLMs, NMT Models, and Their Combination in Persian-English Idiom Translation

Author:	Rezaeimanesh, Sara; Hosseini, Faezeh; Yaghoobzadeh, Yadollah
Publication year:	2024
Subject:	Computer Science - Computation and Language I.2.7
Document type:	Working Paper
Description:	Large language models (LLMs) have shown superior capabilities in translating figurative language compared to neural machine translation (NMT) systems. However, the impact of different prompting methods and LLM-NMT combinations on idiom translation has yet to be thoroughly investigated. This paper introduces two parallel datasets of sentences containing idiomatic expressions for Persian→English and English→Persian translations, with Persian idioms sampled from our PersianIdioms resource, a collection of 2,200 idioms and their meanings. Using these datasets, we evaluate various open- and closed-source LLMs, NMT models, and their combinations. Translation quality is assessed through idiom translation accuracy and fluency. We also find that automatic evaluation methods like LLM-as-a-judge, BLEU, and BERTScore are effective for comparing different aspects of model performance. Our experiments reveal that Claude-3.5-Sonnet delivers outstanding results in both translation directions. For English→Persian, combining weaker LLMs with Google Translate improves results, while Persian→English translations benefit from single prompts for simpler models and complex prompts for advanced ones.
Database:	arXiv
External link:	http://arxiv.org/abs/2412.09993