Using character N-grams to explore diachronic change in medieval English

Authors: Kevin Buckley, Carl Vogel
Year of publication: 2019
Subject:
Source: Folia Linguistica. 53:249-299
ISSN: 1614-7308, 0165-4004
DOI: 10.1515/flih-2019-0012
Description: This paper applies character N-grams to the study of diachronic linguistic variation in a historical language. The period selected for this initial exploratory study is medieval English, a well-studied period of great linguistic variation and language contact, so that the efficacy of computational techniques can be examined by comparison with the wealth of thorough scholarship on medieval linguistic variation. Frequency profiles of character N-gram features were generated for several epochs in the history of English, and a measure of language distance was employed to quantify the similarity between English at different stages in its history. This yielded a quantification of internal change in English. Furthermore, similarity between English and other medieval languages across time was measured, allowing for a measurement of the well-known period of contact between English and Anglo-Norman French. The methodology is compared to traditional lexicostatistical methods and shown to derive the same patterns as those derived from expert-created feature lists (i.e. Swadesh lists). Character N-gram profiles proved to be a flexible and useful method for studying diachronic variation, allowing relevant features of change to be highlighted. The method may complement traditional qualitative examinations.
Database: OpenAIRE
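
Illustration: the description mentions frequency profiles of character N-grams and a measure of language distance, but does not say which measure the paper uses. The following is only a minimal sketch of the general idea, assuming relative-frequency trigram profiles and cosine distance as a stand-in measure. The function names `ngram_profile` and `cosine_distance` are hypothetical, and the three one-line samples are toy inputs, far too short to support any linguistic conclusion.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Return relative frequencies of character n-grams in `text`."""
    # Lowercase and collapse whitespace so spacing differences do not matter.
    text = " ".join(text.lower().split())
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def cosine_distance(p, q):
    """Cosine distance between two profiles (assumed measure, not the paper's)."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


# Toy samples only: one well-known line per language stage.
samples = {
    "Old English": "Hwæt! We Gardena in geardagum",
    "Middle English": "Whan that Aprille with his shoures soote",
    "Old French": "Carles li reis, nostre emperere magnes",
}
profiles = {name: ngram_profile(text) for name, text in samples.items()}

names = list(profiles)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: {cosine_distance(profiles[a], profiles[b]):.3f}")
```

With real corpora, each epoch would be represented by a much larger body of text, and the choice of N, normalisation, and distance measure would all need to follow the paper itself rather than this sketch.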