Examining the Part-of-speech Features in Assessing the Readability of Vietnamese Texts

Authors: An-Vinh Luong, Diep Nguyen, Dien Dinh
Language: English, Slovenian
Publication year: 2020
Subject:
Source: Acta Linguistica Asiatica, Vol 10, Iss 2 (2020)
Document type: article
ISSN: 2232-3317
DOI: 10.4312/ala.10.2.127-142
Description: Text readability plays an important role in selecting materials appropriate to a reader's level. The readability of Vietnamese text has received considerable attention in recent years; however, studies have mainly been limited to simple statistics such as sentence length and word length. In this article, we investigate the role of word-level grammatical characteristics in assessing the difficulty of texts in Vietnamese textbooks. We use machine learning models (for instance, Decision Tree, K-nearest neighbor, and Support Vector Machines) to evaluate the accuracy of classifying texts by readability, using word-level grammatical (part-of-speech, POS) features along with other statistical characteristics. Empirical results show that including POS-level features increases classification accuracy by 2-4%.
Database: Directory of Open Access Journals
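
Note: The description mentions combining surface statistics (sentence length, word length) with word-level POS features. A minimal sketch of what such a feature vector might look like is given below. It is an illustration only, not the authors' feature set: the tag inventory `POS_TAGS`, the function name `readability_features`, and the feature names are all hypothetical, and the input is assumed to be already word-segmented and POS-tagged. Word length is counted in characters here, which is also an assumption.

```python
from collections import Counter

# Hypothetical coarse tag inventory (assumption); the record does not
# say which POS tag set the article uses.
POS_TAGS = ("N", "V", "A", "P", "R", "E", "C")


def readability_features(sentences):
    """Build a feature dict from pre-tagged text.

    sentences: list of sentences, each a list of (word, pos_tag) pairs.
    Returns surface statistics plus one relative-frequency feature per tag.
    """
    words = [(word, tag) for sent in sentences for (word, tag) in sent]
    n_words = len(words)
    if n_words == 0:
        raise ValueError("text has no tokens")

    # Surface statistics of the kind the description calls "simple statistics".
    features = {
        "avg_sentence_length": n_words / len(sentences),
        "avg_word_length": sum(len(word) for word, _ in words) / n_words,
    }

    # POS-level features: share of tokens carrying each tag.
    tag_counts = Counter(tag for _, tag in words)
    for tag in POS_TAGS:
        features[f"pos_ratio_{tag}"] = tag_counts[tag] / n_words
    return features


# Toy example with two short tagged sentences (tags assigned by hand).
text = [
    [("Tôi", "P"), ("đọc", "V"), ("sách", "N")],
    [("Sách", "N"), ("hay", "A")],
]
features = readability_features(text)
```

A vector like this could then be passed to any standard classifier, once with and once without the `pos_ratio_*` entries, to compare classification accuracy in the way the description outlines.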