Quantitative analysis of lexical complexity in contemporary Russian novels
Author: | Yu S Maslennikova, A V Abramov |
---|---|
Year of publication: | 2019 |
Subject: | |
Source: | Journal of Physics: Conference Series. 1391:012145 |
ISSN: | 1742-6596 1742-6588 |
DOI: | 10.1088/1742-6596/1391/1/012145 |
Description: | This research presents a set of methods for measuring the lexical complexity of a text against the culture at large. We applied a range of statistical measures to a set of texts: 50 famous Russian novels of the 20th century, selected for their historical and genre variety. The main approach compares the relative frequencies of a work’s words against the Google Books dataset. The Russian Google Books corpus covers the period 1800-2012 and contains approximately 4.7 billion 1-grams. The relative frequency distribution of each novel’s words was compared with the frequencies in the Google Books corpus for different years using the Jensen-Shannon divergence, the Kullback-Leibler divergence, and other information measures. The Flesch reading ease score, which has been used in similar research on modern English fiction, was also calculated. The results show that the lexical complexity of individual texts should be measured against the culture at large: a writer who is commonly regarded as difficult, verbose or notional can in fact use language that is more ‘common’ (relative to the culture at large) than that of any other text. |
Database: | OpenAIRE |
External link: |
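
The comparison described in the abstract, a novel's word-frequency distribution set against a reference corpus, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the base-2 logarithm, and the handling of words missing from the reference (the Kullback-Leibler divergence is returned as infinite, with no smoothing) are all assumptions made for illustration.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its share of all tokens (hypothetical helper)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumption: if p uses a word that q never contains, the divergence
    is infinite; no smoothing is applied here.
    """
    total = 0.0
    for word, p_w in p.items():
        if p_w == 0.0:
            continue
        q_w = q.get(word, 0.0)
        if q_w == 0.0:
            return math.inf
        total += p_w * math.log2(p_w / q_w)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded by 1."""
    vocabulary = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocabulary}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy data standing in for a novel and one year of a reference corpus.
novel = relative_frequencies("the cat sat on the mat".split())
reference = relative_frequencies("the dog sat on the log".split())
print(js_divergence(novel, reference))
```

Under this reading, a lower divergence from the reference corpus of a given year would indicate vocabulary closer to the 'common' usage of that year. The Jensen-Shannon variant stays finite even when the two vocabularies differ, which is one reason it is often preferred for comparing a single text with a large corpus.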