Phylogenetic signal in phonotactics
Author: | Jayden L. Macklin-Cordes, Claire Bowern, Erich R. Round |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences; FOS: Biological sciences; 01 natural sciences; 0106 biological sciences; 010603 evolutionary biology; 03 medical and health sciences; 0303 health sciences; 030304 developmental biology; J.5; Linguistics and Language; Language and Linguistics; Computer science; Inference; Lexicon; Comparative linguistics; Phonotactics; Phonology; Phylogenetic tree; Phylogenetic comparative methods; Binary data; Artificial intelligence; Natural language processing; Quantitative Biology - Populations and Evolution; Populations and Evolution (q-bio.PE); Computer Science - Computation and Language; Computation and Language (cs.CL) |
Source: | Diachronica |
Description: | Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data, in this instance statistical phonotactics. We extract phonotactic data from 111 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the viability of employing a new source of readily extractable data in historical and comparative linguistics. Main text: 32 pages, 17 figures, 1 table. Supplementary Information: 17 pages, 1 figure. Code and data available at http://doi.org/10.5281/zenodo.3936353. This article is in review but not yet accepted for publication in a journal. |
Database: | OpenAIRE |
External link: |
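
The first two kinds of data named in the description (binary biphone presence and segment-transition frequencies) can be illustrated with a minimal Python sketch. This is not the authors' code, which is available at the DOI given above. The toy lexicon, the one-character-per-segment assumption, the `#` word-boundary symbol, and the choice to normalise counts per preceding segment are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def biphones(word, boundary="#"):
    """Return the two-segment sequences in a word, padded with boundaries.

    Assumption: each character is one segment.
    """
    segs = [boundary] + list(word) + [boundary]
    return list(zip(segs, segs[1:]))

def biphone_presence(lexicon):
    """Set of biphones attested at least once in the lexicon (binary data)."""
    return {bp for word in lexicon for bp in biphones(word)}

def transition_frequencies(lexicon):
    """Relative frequency of each following segment, given the preceding one."""
    counts = defaultdict(Counter)
    for word in lexicon:
        for first, second in biphones(word):
            counts[first][second] += 1
    return {
        first: {second: n / sum(following.values())
                for second, n in following.items()}
        for first, following in counts.items()
    }

# Hypothetical toy lexicon, not drawn from the Pama-Nyungan data.
lexicon = ["pama", "mana", "napa"]
presence = biphone_presence(lexicon)
freqs = transition_frequencies(lexicon)
```

Here `presence` corresponds to the binary dataset (a biphone is either attested or not), while `freqs` retains how often each transition occurs, which is the finer-grained information the description reports as carrying more phylogenetic signal.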