Towards a standard benchmark for variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework.
Authors: | Bridges Y; William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK., de Souza V; European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK., Cortes KG; School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA., Haendel M; Department of Genetics, University of North Carolina, Chapel Hill, Chapel Hill, NC, 27599, USA., Harris NL; Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA., Korn DR; Department of Genetics, University of North Carolina, Chapel Hill, Chapel Hill, NC, 27599, USA., Marinakis NM; Laboratory of Medical Genetics, National and Kapodistrian University of Athens, Athens, 11527, Greece., Matentzoglu N; Semanticly, Athens, 10563, Greece., McLaughlin JA; Samples, Phenotypes, and Ontologies (SPOT), European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK., Mungall CJ; Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA., Osumi-Sutherland D; Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK., Robinson PN; Berlin Institute of Health, Charité - Universitätsmedizin Berlin, Berlin, 10117, Germany., Smedley D; William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK., Jacobsen JO; William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK. |
---|---|
Language: | English |
Source: | BioRxiv : the preprint server for biology [bioRxiv] 2024 Jun 16. Date of Electronic Publication: 2024 Jun 16. |
DOI: | 10.1101/2024.06.13.598672 |
Abstract: | Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the performance of VGPAs has been difficult to measure and is impacted by many factors, for example, ontology structure, annotation completeness or changes to the underlying algorithm. Assertions of the capabilities of VGPAs are often not reproducible, in part because there is no standardised, empirical framework and openly available patient data to assess the efficacy of VGPAs - ultimately hindering the development of effective prioritisation tools. Results: In this paper, we present our benchmarking tool, PhEval, which aims to provide a standardised and empirical framework to evaluate phenotype-driven VGPAs. The inclusion of standardised test corpora and test corpus generation tools in the PhEval suite of tools allows open benchmarking and comparison of methods on standardised data sets. Conclusions: PhEval and the standardised test corpora solve the issues of patient data availability and experimental tooling configuration when benchmarking and comparing rare disease VGPAs. By providing standardised data on patient cohorts from real-world case-reports and controlling the configuration of evaluated VGPAs, PhEval enables transparent, portable, comparable and reproducible benchmarking of VGPAs. As these tools are often a key component of many rare disease diagnostic pipelines, a thorough and standardised method of assessment is essential for improving patient diagnosis and care. |
Database: | MEDLINE |