SEMI-PARAMETRIC COVARIATE-MODULATED LOCAL FALSE DISCOVERY RATE FOR GENOME-WIDE ASSOCIATION STUDIES
Authors: | Wesley K. Thompson, Rong W. Zablocki, Andrew J. Schork, Shujing Xu, Yunpeng Wang, Chun Chieh Fan, Richard A. Levine |
---|---|
Language: | English |
Year of publication: | 2017 |
Subject: |
False discovery rate; Genetics; Bayesian probability; Markov chain Monte Carlo; Computational biology; Biology; Mixture model; Semiparametric model; Covariate; Multinomial distribution; Statistical hypothesis testing; 0303 health sciences; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 0101 mathematics; 030304 developmental biology |
DOI: | 10.1101/183384 |
Description: | While genome-wide association studies (GWAS) have discovered thousands of risk loci for heritable disorders, so far even very large meta-analyses have recovered only a fraction of the heritability of most complex traits. Recent work utilizing variance components models has demonstrated that the additive effects of SNPs capture a larger fraction of the heritability of complex phenotypes than is evident from loci surpassing genome-wide significance thresholds alone, typically set at a Bonferroni-inspired p ≤ 5 × 10⁻⁸. Procedures that control the false discovery rate can be more powerful, yet these are still under-powered to detect the majority of non-null effects from GWAS. The current work proposes a novel Bayesian semi-parametric two-group mixture model and develops a Markov chain Monte Carlo (MCMC) algorithm for a covariate-modulated local false discovery rate (cmfdr). The probability of being non-null depends on a set of covariates via a logistic function, and the non-null distribution is approximated as a linear combination of B-spline densities, where the weight of each B-spline density depends on a multinomial function of the covariates. The proposed methods were motivated by work on a large meta-analysis of schizophrenia GWAS performed by the Psychiatric Genetics Consortium (PGC). We show that the new cmfdr model fits the PGC schizophrenia GWAS test statistics well, performing better than our previously proposed parametric gamma model for estimating the non-null density and substantially improving power over the usual fdr. Using loci declared significant at cmfdr ≤ 0.20, we perform follow-up pathway analyses using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Homo sapiens pathways database. We demonstrate that the increased yield from the cmfdr model results in an improved ability to test for pathways associated with schizophrenia compared to using those SNPs selected according to the usual fdr. |
Database: | OpenAIRE |
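The two-group structure summarized in the description can be illustrated with a small numerical sketch. It is not the authors' MCMC algorithm, and nothing is estimated: all parameters are fixed, made-up values. The function name `cmfdr`, the coefficient arrays `beta` and `Gamma`, and the use of zero-mean normal densities with different scales as stand-ins for the B-spline densities are assumptions chosen for illustration only. Under those assumptions, the quantity computed is cmfdr(z, x) = π0(x) f0(z) / [π0(x) f0(z) + π1(x) f1(z | x)], with π1(x) given by a logistic function of the covariates and the component weights of f1 given by a multinomial (softmax) function of the covariates.

```python
import numpy as np

def normal_pdf(z, sd=1.0):
    """Density of a zero-mean normal with standard deviation sd."""
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def softmax(a):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def cmfdr(z, X, beta, Gamma, component_sds):
    """Covariate-modulated local fdr under a two-group mixture (illustrative).

    z             : (n,) test statistics (z-scores)
    X             : (n, p) covariates, including an intercept column
    beta          : (p,) logistic coefficients for the non-null probability
    Gamma         : (p, K) coefficients for the multinomial component weights
    component_sds : K scales of the stand-in non-null component densities
                    (assumption: the paper uses B-spline densities instead)
    """
    z = np.asarray(z, dtype=float)
    X = np.asarray(X, dtype=float)
    pi1 = 1.0 / (1.0 + np.exp(-(X @ beta)))          # P(non-null | x)
    weights = softmax(X @ Gamma)                     # (n, K), rows sum to 1
    comps = np.stack([normal_pdf(z, sd) for sd in component_sds], axis=1)
    f1 = (weights * comps).sum(axis=1)               # non-null density
    f0 = normal_pdf(z)                               # theoretical null N(0, 1)
    null_part = (1.0 - pi1) * f0
    return null_part / (null_part + pi1 * f1)

# Hypothetical inputs: an intercept plus one covariate (e.g. an annotation score).
z_scores = np.array([0.0, 4.0, 4.0])
covariates = np.array([[1.0, 0.0],
                       [1.0, 0.0],
                       [1.0, 1.0]])
beta = np.array([-3.0, 1.5])        # made-up values, not estimates
Gamma = np.zeros((2, 2))            # equal component weights for every SNP
values = cmfdr(z_scores, covariates, beta, Gamma, component_sds=[2.0, 4.0])
print(values)
```

In this toy setup the two SNPs with the same z-score receive different cmfdr values because the covariate raises the prior probability of being non-null for the second one, which is the mechanism by which covariates can increase yield at a fixed threshold such as cmfdr ≤ 0.20.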