Comprehensive genome analysis and variant detection at scale using DRAGEN.

Authors: Behera S; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA., Catreux S; Illumina, Inc., San Diego, CA, USA. scatreux@illumina.com., Rossi M; Illumina, Inc., San Diego, CA, USA., Truong S; Illumina, Inc., San Diego, CA, USA., Huang Z; Illumina, Inc., San Diego, CA, USA., Ruehle M; Illumina, Inc., San Diego, CA, USA., Visvanath A; Illumina, Inc., San Diego, CA, USA., Parnaby G; Illumina, Inc., San Diego, CA, USA., Roddey C; Illumina, Inc., San Diego, CA, USA., Onuchic V; Illumina, Inc., San Diego, CA, USA., Finocchio A; Illumina, Inc., San Diego, CA, USA., Cameron DL; Illumina, Inc., San Diego, CA, USA., English A; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA., Mehtalia S; Illumina, Inc., San Diego, CA, USA., Han J; Illumina, Inc., San Diego, CA, USA. jhan6@illumina.com., Mehio R; Illumina, Inc., San Diego, CA, USA. rmehio@illumina.com., Sedlazeck FJ; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.; Department of Computer Science, Rice University, Houston, TX, USA. fritz.sedlazeck@bcm.edu.
Language: English
Source: Nature Biotechnology [Nat Biotechnol] 2024 Oct 25. Date of Electronic Publication: 2024 Oct 25.
DOI: 10.1038/s41587-024-02382-1
Abstract: Research and medical genomics require comprehensive, scalable methods for the discovery of novel disease targets, evolutionary drivers and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size or location. Here we present DRAGEN, which uses multigenome mapping with pangenome references, hardware acceleration and machine learning-based variant detection to provide insights into individual genomes, with ~30 min of computation time from raw reads to variant detection. DRAGEN outperforms current state-of-the-art methods in speed and accuracy across all variant types (single-nucleotide variations, insertions or deletions, short tandem repeats, structural variations and copy number variations) and incorporates specialized methods for analysis of medically relevant genes. We demonstrate the performance of DRAGEN across 3,202 whole-genome sequencing datasets by generating fully genotyped multisample variant call format files and demonstrate its scalability, accuracy and innovation to further advance the integration of comprehensive genomics. Overall, DRAGEN marks a major milestone in sequencing data analysis and will provide insights across various diseases, including Mendelian and rare diseases, with a highly comprehensive and scalable platform.
(© 2024. The Author(s).)
Database: MEDLINE