Creating a list of word alignments from parallel Russian simplification data.
Author: | Dmitrieva A; Faculty of Arts, University of Helsinki, Helsinki, Finland.; Language and Cognition Laboratory, Pushkin State Russian Language Institute, Moscow, Russia., Laposhina A; Language and Cognition Laboratory, Pushkin State Russian Language Institute, Moscow, Russia., Lebedeva MY; Language and Cognition Laboratory, Pushkin State Russian Language Institute, Moscow, Russia. |
---|---|
Language: | English |
Source: | Frontiers in artificial intelligence [Front Artif Intell] 2022 Sep 12; Vol. 5, pp. 984759. Date of Electronic Publication: 2022 Sep 12 (Print Publication: 2022). |
DOI: | 10.3389/frai.2022.984759 |
Abstract: | This work describes the development of a list of monolingual word alignments taken from parallel Russian simplification data. These word lists can be used in such lexical simplification tasks as rule-based simplification applications and lexically constrained decoding for neural machine translation models. Moreover, they constitute a valuable source of information for developing educational materials for teaching Russian as a second/foreign language. In this work, a word list was compiled automatically and post-edited by human experts. The resulting list contains 1409 word pairs in which each "complex" word has an equivalent "simpler" (shorter, more frequent, modern, international) synonym. We studied the contents of the word list by comparing the frequencies of the words in the pairs and their levels in the special CEFR-graded vocabulary lists for learners of Russian as a foreign language. The evaluation demonstrated that lexical simplification by means of single-word synonym replacement does not occur often in the adapted texts. The resulting list also illustrates the peculiarities of the lexical simplification task for L2 learners, such as the choice of a less frequent but international word. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2022 Dmitrieva, Laposhina and Lebedeva.) |
Database: | MEDLINE |