Causal graph-based analysis of genome-wide association data in rheumatoid arthritis

Authors: Bo Ding, Jizhou Ai, Alexander Statnikov, Nikita Igorevych Lytkin, Alexander V. Alekseyenko, Leonid Padyukov, Constantin F. Aliferis
Year of publication: 2011
Subject:
Canada
Multivariate statistics
Immunology
Inference
Genomics
Single-nucleotide polymorphism
Genome-wide association study
Disease
Computational biology
Biology
Models, Biological
Polymorphism, Single Nucleotide
General Biochemistry, Genetics and Molecular Biology
Arthritis, Rheumatoid
Major Histocompatibility Complex
03 medical and health sciences
0302 clinical medicine
Humans
SNP
lcsh:QH301-705.5
Ecology, Evolution, Behavior and Systematics
030304 developmental biology
Sweden
Genetics
0303 health sciences
Agricultural and Biological Sciences(all)
Biochemistry, Genetics and Molecular Biology(all)
Gene Expression Profiling
Research
Applied Mathematics
Computational Biology
United States
3. Good health
lcsh:Biology (General)
Conditional independence
030220 oncology & carcinogenesis
Modeling and Simulation
General Agricultural and Biological Sciences
Algorithms
Genome-Wide Association Study
Source: Biology Direct
Biology Direct, Vol 6, Iss 1, p 25 (2011)
ISSN: 1745-6150
DOI: 10.1186/1745-6150-6-25
Description:
Background: GWAS owe their popularity to the expectation that they will make a major impact on diagnosis, prognosis and management of disease by uncovering the genetics underlying clinical phenotypes. The dominant paradigm in GWAS data analysis so far relies extensively on methods that emphasize the contribution of individual SNPs to statistical association with phenotypes. Multivariate methods, however, can extract more information by considering associations of multiple SNPs simultaneously. Recent advances in other genomics domains pinpoint multivariate causal graph-based inference as a promising principled analysis framework for high-throughput data. Designed to discover biomarkers in the local causal pathway of the phenotype, these methods lead to accurate and highly parsimonious multivariate predictive models. In this paper, we investigate the applicability of the causal graph-based method TIE* to the analysis of GWAS data. To test the utility of TIE*, we focus on anti-CCP positive rheumatoid arthritis (RA) GWAS datasets, where there is a general consensus in the community about the major genetic determinants of the disease.
Results: Application of TIE* to the North American Rheumatoid Arthritis Cohort (NARAC) GWAS data yields six SNPs, mostly from the MHC locus. Using these SNPs we develop two predictive models that can classify cases and disease-free controls with an area under the ROC curve (AUC) of 0.81, as verified in independent testing data from the same cohort. The predictive performance of these models generalizes reasonably well to Swedish subjects from the closely related but not identical Epidemiological Investigation of Rheumatoid Arthritis (EIRA) cohort, with an AUC of 0.71-0.78. Moreover, the SNPs identified by the TIE* method render many other previously known SNP associations conditionally independent of the phenotype.
Conclusions: Our experiments demonstrate that application of TIE* captures the maximum amount of genetic information about RA in the data and recapitulates the major consensus findings about the genetic factors of this disease. In addition, TIE* yields reproducible markers and signatures of RA. This suggests that a principled multivariate causal and predictive framework for GWAS analysis empowers the community with a new tool for high-quality and more efficient discovery.
Reviewers: This article was reviewed by Prof. Anthony Almudevar, Dr. Eugene V. Koonin, and Prof. Marianthi Markatou.
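Illustrative note (not part of the original record): the abstract reports that the selected SNPs render other known SNP associations conditionally independent of the phenotype. The sketch below shows what that statement means on a small, hand-built toy table. It is not the TIE* algorithm and uses no data from the paper; the variable names (`snp_a`, `snp_b`, `phenotype`), the counts, and the choice of mutual information as the dependence measure are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum over observed cells only; empty cells contribute nothing.
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def conditional_mutual_information(xs, ys, zs):
    """Empirical I(X; Y | Z): stratum-weighted mutual information within each value of Z."""
    n = len(zs)
    total = 0.0
    for z, nz in Counter(zs).items():
        sub_x = [x for x, zz in zip(xs, zs) if zz == z]
        sub_y = [y for y, zz in zip(ys, zs) if zz == z]
        total += (nz / n) * mutual_information(sub_x, sub_y)
    return total


# Hypothetical toy counts: (snp_a, snp_b, phenotype, number of subjects).
# Within each value of snp_a, snp_b and phenotype are independent by construction.
rows = [
    (0, 1, 1, 1), (0, 1, 0, 3), (0, 0, 1, 3), (0, 0, 0, 9),
    (1, 1, 1, 9), (1, 1, 0, 3), (1, 0, 1, 3), (1, 0, 0, 1),
]
snp_a, snp_b, phenotype = [], [], []
for a, b, y, count in rows:
    snp_a += [a] * count
    snp_b += [b] * count
    phenotype += [y] * count

# snp_b looks associated with the phenotype when examined on its own...
marginal = mutual_information(snp_b, phenotype)
# ...but carries no further information once snp_a is known.
conditional = conditional_mutual_information(snp_b, phenotype, snp_a)
print(f"I(snp_b; phenotype) = {marginal:.4f}")
print(f"I(snp_b; phenotype | snp_a) = {conditional:.4f}")
```

In this toy table `snp_b` shows a univariate association only because it is correlated with `snp_a`; conditioning on `snp_a` removes it. Real analyses would use a statistical test with a significance threshold rather than an exact zero.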
Database: OpenAIRE