Using character N-grams to explore diachronic change in medieval English
Authors: | Kevin Buckley, Carl Vogel |
---|---|
Year of publication: | 2019 |
Subject: | 050101 languages & linguistics; 010104 statistics & probability; Linguistics and Language; Character (mathematics); Computer science; 05 social sciences; Theoretical linguistics; Historical linguistics; 0501 psychology and cognitive sciences; 0101 mathematics; 01 natural sciences; Language and Linguistics; Linguistics |
Source: | Folia Linguistica. 53:249-299 |
ISSN: | 1614-7308 0165-4004 |
DOI: | 10.1515/flih-2019-0012 |
Description: | This paper applies character N-grams to the study of diachronic linguistic variation in a historical language. The period selected for this initial exploratory study is medieval English, a well-studied period of great linguistic variation and language contact, so the efficacy of computational techniques can be examined by comparison with the wealth of thorough scholarship on medieval linguistic variation. Frequency profiles of character N-gram features were generated for several epochs in the history of English, and a measure of language distance was employed to quantify the similarity between English at different stages in its history. This yielded a quantification of internal change in English. Furthermore, similarity between English and other medieval languages was measured across time, allowing the well-known period of contact between English and Anglo-Norman French to be measured. This methodology is compared to traditional lexicostatistical methods and shown to derive the same patterns as those derived from expert-created feature lists (i.e. Swadesh lists). Character N-gram profiles proved to be a flexible and useful method for studying diachronic variation, allowing relevant features of change to be highlighted. This method may complement traditional qualitative examinations. |
Database: | OpenAIRE |
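
The abstract describes building frequency profiles of character N-grams for each epoch and comparing them with a language distance measure. A minimal Python sketch of that general idea follows. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the distance measure, so cosine distance is used here as a stand-in, and the function names `ngram_profile` and `cosine_distance`, the trigram default, and the lowercasing and whitespace normalization are all hypothetical choices.

```python
from collections import Counter
import math


def ngram_profile(text: str, n: int = 3) -> dict[str, float]:
    """Return the relative frequency of each character n-gram in `text`.

    Assumption: text is lowercased and runs of whitespace are collapsed
    to a single space; the paper may preprocess differently.
    """
    normalized = " ".join(text.lower().split())
    counts = Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def cosine_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Cosine distance between two profiles (0 = same direction, 1 = no overlap).

    Assumption: a stand-in measure; the abstract does not say which
    distance the authors used.
    """
    dot = sum(weight * q.get(gram, 0.0) for gram, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Toy samples only: far too short to support any linguistic conclusion.
    old_english = "Hwæt! We Gardena in geardagum, þeodcyninga, þrym gefrunon"
    middle_english = "Whan that Aprille with his shoures soote the droghte of March"
    distance = cosine_distance(ngram_profile(old_english), ngram_profile(middle_english))
    print(f"trigram cosine distance: {distance:.3f}")
```

In practice each profile would be built from a sizeable corpus per epoch, and the pairwise distances between epochs (or between English and another language at each epoch) would then be compared over time.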