Description: |
Introduction: There remains a need to reduce next-generation sequencing (NGS) turnaround for time-sensitive applications. Reducing turnaround requires both faster sequencing and accelerated data analysis. We recently introduced the Singular Genomics G4™ platform for rapid sequencing-by-synthesis (SBS), which can deliver four human whole genomes at ~30x coverage in 19 hours. Here we present accelerated bioinformatics pipelines for germline and somatic variant detection on the G4 that leverage the NVIDIA Clara Parabricks platform and Google DeepVariant. Methods: The G4 sequencer enables 2x150bp reads from two flow cell types (F2: 150M reads; F3: 300M reads), and up to four flow cells may be analyzed in parallel. To maximize speed and performance for germline variant detection, we trained a custom DeepVariant v1.4 whole genome model for the Parabricks platform using data from GIAB reference samples, excluding HG002 (2x150bp reads; multiple library preparation kits used in data generation). We iteratively explored DeepVariant model parameters before validating performance on HG002. Separately, we applied the Parabricks umi_fgbio workflow to perform single family UMI error correction and somatic variant detection following targeted enrichment and sequencing of control gDNA consisting of pooled reference cell lines. Results: The baseline DeepVariant v1.4 Illumina whole genome model delivered a precision and recall of 99.86% and 99.12% for SNPs, and 98.37% and 96.27% for indels, respectively, for HG002 at 31x coverage. The trained model showed improved indel performance, with a precision and recall of 99.86% and 99.10% for SNPs, and 98.56% and 96.81% for indels, respectively, from the same library, driven by gains in performance over low-complexity and homopolymer-rich regions. The model was adapted to Parabricks to deliver a fastq-to-vcf turnaround of 30 minutes for 30x whole genome analysis.
Separately, for somatic variant detection, single family UMI-based error correction and variant calling from 150M input reads were completed in 45 minutes on the Parabricks platform using the p4d.24xlarge GPU instance type. We observed high concordance between observed and predicted variant allele frequencies (R² = 0.990, minimum VAF ≥1%). Conclusions: We have successfully implemented a custom GPU-accelerated DeepVariant whole genome model for the G4, resulting in improved indel performance over challenging genomic features. We further demonstrated accelerated single family UMI error correction and somatic variant detection via the Parabricks umi_fgbio workflow. We anticipate that the combination of rapid SBS and GPU-based acceleration will significantly reduce turnaround for the most time-sensitive variant detection applications. Citation Format: Kenneth Gouin, Liz LaMarca, Ann Tong, Ryan Shultzaberger, Timothy Looney, Martin M. Fabani. Rapid somatic and germline variant detection using the novel G4 sequencing platform and GPU acceleration [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2067. |