Developing an effective scheme for translation and expansion of Persian user queries

Authors: Saeideh Ebrahimy, Mehdi Mohammadi, Razieh Esmailpour, Seyed Mostafa Fakhrahmad, Javad Abbaspour
Year of publication: 2019
Subject:
Source: Digital Scholarship in the Humanities.
ISSN: 2055-768X
2055-7671
DOI: 10.1093/llc/fqz041
Description: This study introduces a new source for translating and expanding user queries in Persian, with the aim of developing a bilingual dictionary. The required data were extracted and processed from the English and Persian bibliographic information of journal articles to build a dictionary for query translation and expansion, denoted the Query Expansion Assistant Database (QEAD). Psychology and educational sciences journals were selected as the sample, with the potential for extension to other domains. Persian–English authors’ keywords were used for the translation part, and titles of English references were used to extract phrases with natural language processing techniques for the expansion part. The proposed algorithm is demonstrated. The approach was then assessed through human evaluation, using Google Translate (GT) and Google Scholar as points of comparison. Although the evaluation of the translation part indicated a 60% match between GT and QEAD, in 40% of unmatched translations QEAD performed better according to expert evaluators. The expansion part of QEAD was compared with Google Scholar suggestions, which indicated that the expanded words of QEAD are on a par with Google Scholar suggestions. Persian, as a low-resource language, needs higher-quality lexicon translation. In addition, this is the first time the English–Persian bibliographic information of scientific journals has been used to mine lexicon translations. Since these journals are peer-reviewed, they can be a valuable source for translating users’ queries. Because journals are published frequently, users can be informed of the most prevalent and up-to-date words or phrases among scientists.
Database: OpenAIRE