A general framework for moment-based analysis of genetic data

Authors: David J. Balding, Asger Hobolth, Maria Speed
Year of publication: 2019
Subjects:
Data Analysis
Moments
Mutation rate
Dirichlet
Population
Hierarchical Beta
Datasets as Topic
Population genetics
01 natural sciences
Dirichlet distribution
010305 fluids & plasmas
Diffusion
Evolutionary history
03 medical and health sciences
0103 physical sciences
Allele fraction
Humans
Quantitative Biology::Populations and Evolution
Applied mathematics
Multi-allelic Wright–Fisher
Computer Simulation
Fraction (mathematics)
Alleles
030304 developmental biology
0303 health sciences
Models, Genetic
Heavy traffic approximation
Quantitative Biology::Genomics
Agricultural and Biological Sciences (miscellaneous)
Beta–Dirichlet
Moment (mathematics)
Genetics
Pyramid
Modeling and Simulation
Mutation (genetic algorithm)
Distribution of allele fractions
Mutation processes
Source: Speed, M, Balding, D J & Hobolth, A 2019, 'A general framework for moment-based analysis of genetic data', Journal of Mathematical Biology, vol. 78, no. 6, pp. 1727-1769. https://doi.org/10.1007/s00285-018-01325-0
ISSN: 1432-1416, 0303-6812
DOI: 10.1007/s00285-018-01325-0
Description: In population genetics, the Dirichlet (also called the Balding–Nichols) model has for 20 years been considered the key model for approximating the distribution of allele fractions within populations in a multi-allelic setting. It has often been noted that the Dirichlet assumption is only approximate, because the Dirichlet model cannot accommodate positive correlations among alleles. However, the validity of the Dirichlet distribution has never been systematically investigated in a general framework. This paper addresses the problem by providing a general overview of how allele fraction data should be modeled under the most common multi-allelic mutational structures. The Dirichlet and alternative models are investigated by simulating allele fractions from a diffusion approximation of the multi-allelic Wright–Fisher process with mutation and applying a moment-based analysis method. The study shows that the optimal modeling strategy for the distribution of allele fractions depends on the specific mutation process. The Dirichlet model is an exceptionally good approximation only for the pure drift, Jukes–Cantor and parent-independent mutation processes with small mutation rates. Alternative models are required, and proposed, for the other mutation processes: a Beta–Dirichlet model for the infinite alleles mutation process, and a Hierarchical Beta model for the Kimura, Hasegawa–Kishino–Yano and Tamura–Nei processes. Finally, a novel Hierarchical Beta approximation, the Pyramidal Hierarchical Beta model, is developed for the generalized time-reversible and single-step mutation processes.
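The remark that the Dirichlet model cannot accommodate positive correlations among alleles can be checked directly from its first two moments. The following is a minimal sketch, not the paper's analysis method: the function names `dirichlet_moments` and `fit_dirichlet_by_moments`, the example parameters, and the simple moment-matching estimator are all assumptions made for illustration. It uses the standard Dirichlet moments and the Balding–Nichols relation Var(x_i) = F p_i (1 - p_i), with F = 1 / (alpha_0 + 1).

```python
import numpy as np


def dirichlet_moments(alpha):
    """Return the mean vector and covariance matrix of Dirichlet(alpha)."""
    alpha = np.asarray(alpha, dtype=float)
    alpha0 = alpha.sum()
    mean = alpha / alpha0
    # Cov(x_i, x_j) = (delta_ij * p_i - p_i * p_j) / (alpha0 + 1)
    cov = (np.diag(mean) - np.outer(mean, mean)) / (alpha0 + 1.0)
    return mean, cov


def fit_dirichlet_by_moments(x):
    """Hypothetical moment-matching fit (illustrative, not from the paper).

    x has shape (n_samples, n_alleles); each row sums to 1.
    Returns the estimated alpha vector and the estimated F.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    var = x.var(axis=0, ddof=1)
    # Under the Dirichlet model Var(x_i) = F * p_i * (1 - p_i),
    # so pooling over alleles gives a single estimate of F.
    f_hat = var.sum() / (mean * (1.0 - mean)).sum()
    alpha0_hat = 1.0 / f_hat - 1.0
    return mean * alpha0_hat, f_hat


# Assumed example values, chosen only for illustration.
rng = np.random.default_rng(0)
true_alpha = np.array([2.0, 3.0, 5.0, 10.0])
samples = rng.dirichlet(true_alpha, size=20000)

mean, cov = dirichlet_moments(true_alpha)
off_diag = cov[~np.eye(len(true_alpha), dtype=bool)]
alpha_hat, f_hat = fit_dirichlet_by_moments(samples)

print("all off-diagonal covariances negative:", bool(np.all(off_diag < 0)))
print("estimated alpha:", np.round(alpha_hat, 2))
print("estimated F:", round(float(f_hat), 4))
```

Every off-diagonal covariance equals -p_i p_j / (alpha_0 + 1), which is strictly negative for any valid parameters. Allele fractions that are positively correlated therefore fall outside what a single Dirichlet distribution can represent, whatever parameters are fitted.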
Database: OpenAIRE