Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks

Authors: Zhong-Hui Duan, Mark R. Dalman, Joseph S. Haddad, Anthony Deeter
Year of publication: 2017
Subject:
0301 basic medicine
Cell signaling
Muscle Physiology
Physiology
Gene Identification and Analysis
lcsh:Medicine
Datasets as Topic
Genetic Networks
Signal transduction
Bioinformatics
Biochemistry
Bayes' theorem
Immune Physiology
Protein Interaction Mapping
Medicine and Health Sciences
lcsh:Science
Musculoskeletal System
Innate Immune System
Multidisciplinary
Protein Kinase Signaling Cascade
Muscles
Signaling cascades
STAT signaling
Cytokines
Anatomy
Network Analysis
Research Article
Muscle Contraction
PubMed
Cell biology
Computer and Information Sciences
Immunology
Computational biology
Biology
Set (abstract data type)
Gene product
03 medical and health sciences
Ingenuity
Genetics
KEGG
Protein Interactions
Cardiac Muscles
Biology and life sciences
lcsh:R
Experimental data
Bayesian network
Proteins
Molecular Development
030104 developmental biology
JAK-STAT signaling cascade
Immune System
lcsh:Q
Developmental Biology
Source: PLoS ONE
PLoS ONE, Vol 12, Iss 10, p e0186004 (2017)
ISSN: 1932-6203
Description: The PubMed database offers an extensive set of publication data that is useful yet inherently difficult to exploit without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mined from PubMed publication abstracts, we verify that, regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method not only confirms the existence of currently accepted interactions but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Database: OpenAIRE
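
The abstract describes combining many Bayesian networks into a consensus network that can be tuned through a "network resolution". The record does not define that term or give the authors' algorithm, so the following is only a minimal sketch under a stated assumption: resolution is treated here as the minimum fraction of networks in which a directed edge must appear to be kept. The function name `consensus_edges`, the edge-set representation, and the four toy networks are hypothetical and chosen purely for illustration; they are not results from the paper.

```python
from collections import Counter


def consensus_edges(networks, resolution):
    """Return {edge: support} for edges whose support >= resolution.

    ASSUMPTION: each network is a collection of directed (parent, child)
    edges, and `resolution` is a support threshold in [0, 1]. This is an
    illustrative reading of "network resolution", not the paper's definition.
    """
    if not networks:
        return {}
    total = len(networks)
    # Count each edge at most once per network.
    counts = Counter(edge for net in networks for edge in set(net))
    return {
        edge: count / total
        for edge, count in counts.items()
        if count / total >= resolution
    }


# Toy input (made up): four learned networks, e.g. one per random
# topological ordering of the gene list.
networks = [
    {("JAK", "STAT"), ("STAT", "PI3K"), ("PI3K", "AKT")},
    {("JAK", "STAT"), ("PI3K", "AKT"), ("RAS", "AKT")},
    {("JAK", "STAT"), ("STAT", "PI3K"), ("AKT", "MTOR")},
    {("JAK", "STAT"), ("PI3K", "AKT"), ("AKT", "MTOR")},
]

consensus = consensus_edges(networks, resolution=0.5)
for edge, support in sorted(consensus.items()):
    print(edge, support)
```

Under this reading, raising `resolution` keeps only edges that recur across most networks, while lowering it admits rarer edges that may be candidate novel interactions.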