Developing an effective scheme for translation and expansion of Persian user queries
Author: | Saeideh Ebrahimy, Mehdi Mohammadi, Razieh Esmailpour, Seyed Mostafa Fakhrahmad, Javad Abbaspour |
---|---|
Publication year: | 2019 |
Subject: |
Scheme (programming language); Linguistics and Language; Computer science; business.industry; 05 social sciences; 02 engineering and technology; Translation (geometry); computer.software_genre; Language and Linguistics; language.human_language; Computer Science Applications; 0202 electrical engineering, electronic engineering, information engineering; language; 020201 artificial intelligence & image processing; Artificial intelligence; 0509 other social sciences; 050904 information & library sciences; business; GeneralLiterature_REFERENCE (e.g. dictionaries, encyclopedias, glossaries); computer; Natural language processing; Information Systems; computer.programming_language; Persian |
Source: | Digital Scholarship in the Humanities. |
ISSN: | 2055-768X 2055-7671 |
DOI: | 10.1093/llc/fqz041 |
Description: | This study introduces a new source for translating and expanding user queries in the Persian language, with the goal of developing a bilingual dictionary. The required data were extracted and processed from the English and Persian bibliographic information of journal articles to build a dictionary for query translation and expansion, called the Query Expansion Assistant Database (QEAD). Psychology and educational sciences journals were selected as the sample, with the potential for extension to other domains. Persian–English author keywords were used for the translation part, and phrases extracted from the titles of English references using natural language processing techniques were used for the expansion part. The proposed algorithm is demonstrated and then assessed through human evaluation against Google Translate (GT) and Google Scholar. Although the evaluation of the translation part indicated a 60% match between GT and QEAD, in 40% of unmatched translations QEAD performed better according to expert evaluators. The expansion part of QEAD was compared with Google Scholar suggestions, and the expanded words of QEAD were found to be comparable to them. Persian, as a low-resource language, needs higher-quality lexicon translation. In addition, this is the first time the English–Persian bibliographic information of scientific journals has been used to mine lexicon translations. Because these journals are peer-reviewed, they can be a valuable source for translating users' queries. Because journals are published frequently, users can also be informed of the most prevalent and up-to-date words or phrases among scientists. |
Database: | OpenAIRE |
External link: |
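
The two-part approach summarized in the description (translation from paired author keywords, expansion from English reference titles) might be sketched roughly as follows. This is a minimal illustration only, not the authors' algorithm: the record fields, the sample data, the stopword list, and the function names are all hypothetical, and simple word n-gram counting stands in for the natural language processing techniques the study actually used. It also assumes that Persian and English keywords appear in the same order within each record.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; field names are illustrative, not the study's schema.
ARTICLES = [
    {
        "keywords_fa": ["اضطراب", "یادگیری"],
        "keywords_en": ["anxiety", "learning"],
        "reference_titles": [
            "Test anxiety and academic performance",
            "Anxiety disorders in school children",
        ],
    },
    {
        "keywords_fa": ["اضطراب"],
        "keywords_en": ["anxiety"],
        "reference_titles": ["Test anxiety and learning strategies"],
    },
]

# Tiny illustrative stopword list (an assumption, not taken from the study).
STOPWORDS = {"and", "in", "of", "the"}


def build_translation_table(articles):
    """Count how often each Persian keyword is paired with each English keyword."""
    table = defaultdict(Counter)
    for article in articles:
        # Assumes keywords are listed in the same order in both languages.
        for fa, en in zip(article["keywords_fa"], article["keywords_en"]):
            table[fa][en.lower()] += 1
    return table


def translate(table, persian_term):
    """Return the most frequently paired English keyword, or None if unseen."""
    candidates = table.get(persian_term)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def expand(articles, english_term, n=2, top_k=3):
    """Return frequent stopword-free word n-grams from reference titles that contain the term."""
    counts = Counter()
    for article in articles:
        for title in article["reference_titles"]:
            words = title.lower().split()
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if english_term in gram and not STOPWORDS.intersection(gram):
                    counts[" ".join(gram)] += 1
    return [phrase for phrase, _ in counts.most_common(top_k)]


table = build_translation_table(ARTICLES)
term = translate(table, "اضطراب")
print(term, expand(ARTICLES, term))
```

Choosing the most frequent pairing is one simple way to resolve a Persian keyword that has been translated differently across articles; a fuller version would likely keep all candidates along with their counts.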