Hypothesis generation for rare and undiagnosed diseases through clustering and classifying time-versioned biological ontologies.
Authors: | Bradshaw MS; Department of Computer Science, University of Colorado Boulder, Boulder, CO, United States of America., Gibbs C; Department of Statistics, Colorado State University, Fort Collins, CO, United States of America., Martin S; Department of Computer Science, University of Colorado Boulder, Boulder, CO, United States of America., Firman T; Precision Medicine Institute, Children's Hospital Colorado, Aurora, CO, United States of America., Gaskell A; Precision Medicine Institute, Children's Hospital Colorado, Aurora, CO, United States of America., Fosdick B; Department of Biostatistics & Informatics, Colorado School of Public Health, Aurora, CO, United States of America., Layer R; Department of Computer Science, University of Colorado Boulder, Boulder, CO, United States of America. |
---|---|
Language: | English |
Source: | PloS one [PLoS One] 2024 Dec 26; Vol. 19 (12), pp. e0309205. Date of Electronic Publication: 2024 Dec 26 (Print Publication: 2024). |
DOI: | 10.1371/journal.pone.0309205 |
Abstract: | Rare diseases affect 1 in 10 people in the United States, and despite increased genetic testing, up to half never receive a diagnosis. Even when advanced genome sequencing platforms are used to discover variants, a patient will remain undiagnosed if the literature contains no connection between the variants found in the patient's genome and their phenotypes. When a direct variant-phenotype connection is not known, placing a patient's information in the larger context of phenotype relationships and protein-protein interactions may provide an opportunity to find an indirect explanation. Databases such as STRING contain millions of protein-protein interactions, and the Human Phenotype Ontology (HPO) contains the relations among thousands of phenotypes. By integrating these networks and clustering the entities within them, we can potentially discover latent gene-to-phenotype connections. The historical records for STRING and HPO provide a unique opportunity to create a network time series for evaluating cluster significance. Most excitingly, working with Children's Hospital Colorado, we have provided promising hypotheses about latent gene-to-phenotype connections for 38 patients. We also provide potential answers for 14 patients listed on MyGene2. Clusters that our tool finds significant harbor 2.35 to 8.72 times as many gene-to-phenotype edges inferred from known drug interactions as clusters found to be insignificant. Our tool, BOCC, is available as a web app and command line tool. Competing Interests: The authors have declared that no competing interests exist. (Copyright: © 2024 Bradshaw et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.) |
Database: | MEDLINE |