Prioritizing Crohn’s disease genes by integrating association signals with gene expression implicates monocyte subsets
Authors: | Judy H. Cho, Ephraim Kenigsberg, Wallace Crandall, Ling-Shiang Chuang, Subra Kugathasan, Clara Abraham, Lee A. Denson, Joshua D. Noe, Nai Yun Hsu, Anne M. Griffiths, Jeffrey S. Hyams, Kyle Gettler, Mamta Giri, Richard Kellermayer, Jerome Martin, David R. Mack, Gabriel E. Hoffman |
---|---|
Year of publication: | 2019 |
Subject: | 0301 basic medicine; Sequence analysis; Immunology; Locus (genetics); Genome-wide association study; Disease; Computational biology; Biology; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Gene expression; Genetics; Epigenetics; Gene; Genetics (clinical); 030215 immunology; Genetic association |
Source: | Genes & Immunity. 20:577-588 |
ISSN: | 1476-5470 1466-4879 |
Description: | Genome-wide association studies have identified ~170 loci associated with Crohn's disease (CD), and defining which genes drive these association signals is a major challenge. The primary aim of this study was to define which CD locus genes are most likely to be disease related. We developed a gene prioritization regression model (GPRM) by integrating complementary mRNA expression datasets, including bulk RNA-Seq from the terminal ileum of 302 newly diagnosed, untreated CD patients and controls, and expression data from stimulated monocytes. Transcriptome-wide association and co-expression network analyses were performed on the ileal RNA-Seq datasets, identifying 40 genome-wide significant genes. Co-expression network analysis identified a single gene module that was substantially enriched for CD locus genes and most highly expressed in monocytes. By including expression-based and epigenetic information, we refined likely CD genes from an average of 7.8 total genes per locus to 2.5 prioritized genes per locus. We validated our model structure using cross-validation and our prioritization results by protein-association network analyses, which demonstrated significantly higher CD gene interactions for prioritized compared with non-prioritized genes. Although individual datasets cannot convey all of the information relevant to a disease, combining data from multiple relevant expression-based datasets improves prediction of disease genes and furthers understanding of disease pathogenesis. |
Database: | OpenAIRE |