Measuring and interpreting lexical dispersion in corpus linguistics

Authors: Douglas Biber, Jesse Egbert, Brent D. Burch
Year of publication: 2017
Subject:
Source: Journal of Research Design and Statistics in Linguistics and Communication Science. 3:189-216
ISSN: 2052-4188, 2052-417X
DOI: 10.1558/jrds.33066
Description: The frequency of occurrence and the dispersion of a word are measures of a word’s importance in a collection of texts or a corpus. In particular, lexical dispersion is a statistic in corpus linguistics that measures a word’s homogeneity across the parts of a corpus. There are different ways to measure dispersion and the authors compare three approaches. Both formulaic and interpretative issues pertaining to dispersion are discussed in terms of the frequency of a word in the corpus parts and the variability of a word across the corpus. A simulation study and an application involving words from the British National Corpus indicate that the index constructed from the difference between every possible pair of frequencies of the word in the parts of a corpus is preferred.
Database: OpenAIRE
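
Illustration: the description mentions an index built from the difference between every possible pair of a word's frequencies across the corpus parts. The record does not give the formula, so the following is only a minimal sketch of one way such an index could be computed. The function name `pairwise_dispersion`, the use of absolute differences, and the scaling by twice the mean frequency (so that the result runs from 0 for a word confined to one part to 1 for a perfectly even spread) are assumptions made for illustration, not the authors' confirmed definition.

```python
from itertools import combinations


def pairwise_dispersion(rates):
    """Hypothetical pairwise-difference dispersion index (illustrative only).

    `rates` holds the word's normalized frequency in each corpus part,
    e.g. occurrences per million words. Returns a value between 0 and 1.
    """
    k = len(rates)
    if k < 2:
        raise ValueError("need at least two corpus parts")
    mean_rate = sum(rates) / k
    if mean_rate == 0:
        raise ValueError("the word does not occur in any part")

    # Average absolute difference over every unordered pair of parts.
    pairs = list(combinations(rates, 2))
    mean_abs_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)

    # Assumed scaling: divide by twice the mean rate, subtract from 1.
    return 1 - mean_abs_diff / (2 * mean_rate)
```

Under these assumptions, a word with the same rate in every part scores 1, and a word that occurs in only one part scores 0; intermediate values reflect how unevenly the word is spread.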