Measuring and interpreting lexical dispersion in corpus linguistics
Author: | Douglas Biber, Jesse Egbert, Brent D. Burch |
---|---|
Year of publication: | 2017 |
Subject: | |
Source: | Journal of Research Design and Statistics in Linguistics and Communication Science. 3:189-216 |
ISSN: | 2052-4188 2052-417X |
DOI: | 10.1558/jrds.33066 |
Description: | The frequency of occurrence and the dispersion of a word are measures of a word’s importance in a collection of texts or a corpus. In particular, lexical dispersion is a statistic in corpus linguistics that measures a word’s homogeneity across the parts of a corpus. There are different ways to measure dispersion and the authors compare three approaches. Both formulaic and interpretative issues pertaining to dispersion are discussed in terms of the frequency of a word in the corpus parts and the variability of a word across the corpus. A simulation study and an application involving words from the British National Corpus indicate that the index constructed from the difference between every possible pair of frequencies of the word in the parts of a corpus is preferred. |
Database: | OpenAIRE |
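
The abstract describes the preferred index only in words: it is built from the difference between every possible pair of a word's frequencies across the corpus parts. The sketch below shows one way such an index could be computed. The function name `pairwise_dispersion`, the assumption that the corpus parts are equally sized (or that the frequencies are already normalized), and the scaling by twice the mean frequency are illustrative assumptions, not details confirmed by this record.

```python
from itertools import combinations


def pairwise_dispersion(freqs):
    """Hypothetical pairwise-difference dispersion index.

    freqs: frequencies of one word in each corpus part, assumed to come
    from equally sized parts or to be normalized already.
    Returns 1.0 for a perfectly even distribution and 0.0 when every
    occurrence falls in a single part (under the assumed scaling).
    """
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two corpus parts")
    mean = sum(freqs) / n
    if mean == 0:
        raise ValueError("the word does not occur in the corpus")

    # Mean absolute difference over every unordered pair of parts.
    pairs = list(combinations(freqs, 2))
    mean_abs_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)

    # Assumed scaling: divide by twice the mean, then invert so that
    # higher values mean a more even spread.
    return 1 - mean_abs_diff / (2 * mean)


even = pairwise_dispersion([5, 5, 5, 5])     # word spread evenly
clumped = pairwise_dispersion([20, 0, 0, 0])  # word confined to one part
```

With these inputs, the evenly spread word scores 1.0 and the word confined to a single part scores 0.0, which matches the reading of dispersion as homogeneity across corpus parts.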