Structured Genome-Wide Association Studies with Bayesian Hierarchical Variable Selection

Authors: Hongtu Zhu, Yize Zhao, Fei Zou, Zhaohua Lu, Rebecca C. Knickmeyer
Language: English
Publication year: 2019
Subject:
Genetic Markers
Bayesian probability
Inference
Genome-wide association study
Feature selection
Neuroimaging
Biology
Investigations
Machine learning
01 natural sciences
Polymorphism, Single Nucleotide
SNP-set
Set (abstract data type)
010104 statistics & probability
03 medical and health sciences
Alzheimer Disease
Genetics
Humans
Computer Simulation
0101 mathematics
Selection (genetic algorithm)
030304 developmental biology
Genetic association
Bayesian variable selection
0303 health sciences
Models, Genetic
Markov chain Monte Carlo
Bayes Theorem
Markov Chains
Phenotype
genome-wide association studies
imaging genetics
Artificial intelligence
Statistical Genetics and Genomics
Algorithms
Genome-Wide Association Study
Source: Genetics
ISSN: 1943-2631
0016-6731
Description: In genome-wide association studies (GWAS), it is increasingly important to select genetic information associated with qualitative or quantitative traits. The discovery of biological associations among SNPs has motivated various strategies for constructing SNP-sets along the genome and incorporating this set information into the selection procedure, both to increase selection power and to yield more biologically meaningful results. This paper proposes a novel Bayesian framework for hierarchical variable selection at both the SNP-set (group) level and the SNP (within-group) level. We overcome a key limitation of the posterior updating schemes used in most Bayesian variable selection methods by proposing a novel sampling scheme that explicitly accommodates the ultrahigh dimensionality of genetic data. Specifically, we construct an auxiliary variable selection model at the SNP-set level and use its posterior samples to guide posterior inference for the targeted hierarchical selection model. In a variety of simulation studies, we show that our method is computationally efficient and achieves substantially better performance than competing approaches in both SNP-set and SNP selection. Applying the method to the Alzheimer's Disease Neuroimaging Initiative (ADNI) data, we identify biologically meaningful genetic factors for several neuroimaging volumetric phenotypes. Our method is general and can be readily applied to a wide range of biomedical studies.
Database: OpenAIRE