Simulation of undiagnosed patients with novel genetic conditions.

Authors: Alsentzer E; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA.; Program in Health Sciences and Technology, MIT, Cambridge, MA, 02139, USA., Finlayson SG; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA.; Program in Health Sciences and Technology, MIT, Cambridge, MA, 02139, USA.; Department of Pediatrics, Division of Genetic Medicine, Seattle Children's Hospital, Seattle, WA, 98105, USA.; Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, 98105, USA., Li MM; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA.; Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, 02115, USA., Kobren SN; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA. Shilpa_Kobren@hms.harvard.edu., Kohane IS; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA. Isaac_Kohane@hms.harvard.edu.
Language: English
Source: Nature communications [Nat Commun] 2023 Oct 12; Vol. 14 (1), pp. 6403. Date of Electronic Publication: 2023 Oct 12.
DOI: 10.1038/s41467-023-41980-6
Abstract: Rare Mendelian disorders pose a major diagnostic challenge and collectively affect 300-400 million patients worldwide. Many automated tools aim to uncover causal genes in patients with suspected genetic disorders, but evaluation of these tools is limited due to the lack of comprehensive benchmark datasets that include previously unpublished conditions. Here, we present a computational pipeline that simulates realistic clinical datasets to address this deficit. Our framework jointly simulates complex phenotypes and challenging candidate genes and produces patients with novel genetic conditions. We demonstrate the similarity of our simulated patients to real patients from the Undiagnosed Diseases Network and evaluate common gene prioritization methods on the simulated cohort. These prioritization methods recover known gene-disease associations but perform poorly on diagnosing patients with novel genetic disorders. Our publicly available dataset and codebase can be utilized by medical genetics researchers to evaluate, compare, and improve tools that aid in the diagnostic process.
(© 2023. Springer Nature Limited.)
Database: MEDLINE