Novel genetic matching methods for handling population stratification in genome-wide association studies
Author: | Tatsiana Vaitsiakhovich, Frank Jessen, André Lacour, Christine Herold, Wolfgang Maier, Tim Becker, Vitalia Schüller, Alfredo Ramirez, Markus Leber, Dmitriy Drichel, Markus M. Noethen |
---|---|
Language: | English |
Year of publication: | 2015 |
Subject: |
Matching (statistics)
population stratification; Genotype; Computer science; Population genetics [Alzheimer Disease]; Population stratification methods [Genome-Wide Association Study]; Biochemistry; Genome-wide association studies; Population Groups; Structural Biology; Alzheimer Disease; Covariate; Cluster Analysis; Humans; ddc:610; Molecular Biology; Blossom algorithm; Statistical hypothesis testing; genetic matching; Applied Mathematics; structured association; Computer Science Applications; Genetics, Population; Logistic Models; Case-Control Studies; Pairwise comparison; Data mining; Null hypothesis; Genome-Wide Association Study; Research Article |
Source: | BMC Bioinformatics 16(1), 84 (2015). doi:10.1186/s12859-015-0521-4 |
DOI: | 10.1186/s12859-015-0521-4 |
Description: | Background: A commonly encountered problem in association studies is population stratification. In this work, we propose a novel framework for population matching in the context of genome-wide and sequencing association studies. We employ pairwise and groupwise optimal case-control matchings and present an agglomerative hierarchical clustering, both based on a genetic similarity score matrix. To ensure that the matches produced by the matching algorithm correctly capture the population structure, we propose and discuss two stratum validation methods. We also introduce an extension of the Cochran-Armitage trend test that explicitly takes the particular population structure into account. Results: We assess our framework by simulating genotype data under the null hypothesis to confirm that it correctly controls the type-1 error rate. In a power study, we find that structured association testing using our framework has reasonable power. We compare our results with those obtained from a logistic regression model with principal component covariates. Using the principal components approach, we also find a possible false-positive association with Alzheimer’s disease, which is supported neither by our new methods nor by the results of a recent large meta-analysis or a mixed model approach. Conclusions: Matching methods provide an alternative way of handling confounding due to population stratification for statistical tests in which covariates are hard to model. As a benchmark, we show that our matching framework performs as well as state-of-the-art models on common variants. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0521-4) contains supplementary material, which is available to authorized users. |
Database: | OpenAIRE |
External link: |
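
The description above refers to the Cochran-Armitage trend test, which the authors extend to account for population structure. As a point of reference only, the following is a minimal sketch of the standard, unstratified test on a 2x3 case-control genotype table. It is not the authors' extension, which the record does not specify. The function name `cochran_armitage_trend`, its parameters, and the additive weights `(0, 1, 2)` are assumptions chosen for illustration.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Standard (unstratified) Cochran-Armitage trend test.

    cases, controls: genotype counts per category (e.g. 0, 1, 2 minor alleles).
    weights: trend scores per category; (0, 1, 2) is the additive model
             (an illustrative default, not taken from the record above).
    Returns (z, p) where p is the two-sided normal-approximation p-value.
    """
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have equal length")

    k = len(weights)
    total_cases = sum(cases)
    total_controls = sum(controls)
    col_totals = [r + s for r, s in zip(cases, controls)]
    n = total_cases + total_controls

    # Test statistic: weighted difference between case and control counts.
    t_stat = sum(
        w * (r * total_controls - s * total_cases)
        for w, r, s in zip(weights, cases, controls)
    )

    # Variance of the statistic under the null hypothesis of no association.
    diag = sum(w * w * c * (n - c) for w, c in zip(weights, col_totals))
    cross = sum(
        weights[i] * weights[j] * col_totals[i] * col_totals[j]
        for i in range(k)
        for j in range(i + 1, k)
    )
    variance = (total_cases * total_controls / n) * (diag - 2 * cross)
    if variance <= 0:
        raise ValueError("degenerate table: variance is not positive")

    z = t_stat / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


if __name__ == "__main__":
    # Hypothetical counts for genotypes with 0, 1 and 2 copies of an allele.
    z, p = cochran_armitage_trend([10, 20, 30], [30, 20, 10])
    print(f"z = {z:.3f}, p = {p:.3g}")
```

A structured variant would typically compute such a statistic within each matched stratum and combine the per-stratum contributions; how the authors do this is described in the article itself, not in this record.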