Method of the coherence evaluation of Ukrainian text

Author: Pogorilyy, S. D., Kramov, A. A.
Year of publication: 2020
Subject:
Source: Data Recording, Storage & Processing. 2018. Vol. 20, Issue 4. P. 64-75
Document type: Working Paper
DOI: 10.35681/1560-9189.2018
Description: Due to the growing role of SEO technologies, automated analysis of article quality has become necessary. Such an approach helps both to return the most intelligible pages for a user's query and to raise web sites' positions toward the top of query results. Automated coherence assessment is one part of a comprehensive text analysis. This article analyzes the main methods for measuring text coherence for the Ukrainian language. The expediency of using the semantic similarity graph method in comparison with other methods is explained. An improvement of that method is suggested: pre-training the neural network for vector representations of sentences. The original method and its modifications are examined experimentally. Training and evaluation are performed on a corpus of Ukrainian texts previously retrieved from abstracts and full texts of Ukrainian scientific articles. Testing is carried out on two typical tasks for text coherence assessment: the document discrimination task and the insertion task. Based on the analysis, the most effective combination of the method's modification and its parameter for measuring text coherence is determined.
Comment: 16 pages, in Ukrainian, 5 figures
Database: arXiv