DeepSomatic: Accurate somatic small variant discovery for multiple sequencing technologies.
Authors: | Park J; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., Cook DE; Google Inc, Mountain View, CA, USA., Chang PC; Google Inc, Mountain View, CA, USA., Kolesnikov A; Google Inc, Mountain View, CA, USA., Brambrink L; Google Inc, Mountain View, CA, USA., Mier JC; Google Inc, Mountain View, CA, USA., Gardner J; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., McNulty B; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., Sacco S; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., Keskus A; Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA., Bryant A; Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA., Ahmad T; Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA., Shetty J; Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA., Zhao Y; Sequencing Facility Bioinformatics Group, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA., Tran B; Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA., Narzisi G; New York Genome Center, NY, USA., Helland A; New York Genome Center, NY, USA., Yoo B; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Pushel I; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Lansdon LA; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Bi C; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Walter A; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Gibson M; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Pastinen T; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Farooqi MS; Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA., Robine N; New York Genome Center, NY, USA., Miga KH; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., Carroll A; Google Inc, Mountain View, CA, USA., Kolmogorov M; Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA., Paten B; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA., Shafin K; Google Inc, Mountain View, CA, USA. |
---|---|
Language: | English |
Source: | bioRxiv: the preprint server for biology [bioRxiv] 2024 Aug 19. Date of Electronic Publication: 2024 Aug 19. |
DOI: | 10.1101/2024.08.16.608331 |
Abstract: | Somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies now offer potential advantages in terms of repeat mapping and variant phasing. We present DeepSomatic, a deep learning method for detecting somatic SNVs and insertions and deletions (indels) from both short-read and long-read data, with modes for whole-genome and exome sequencing and support for tumor-normal, tumor-only, and FFPE-prepared samples. To help address the dearth of publicly available training and benchmarking data for somatic variant detection, we generated and make openly available a dataset of five matched tumor-normal cell line pairs sequenced with Illumina, PacBio HiFi, and Oxford Nanopore Technologies, along with benchmark variant sets. Across samples and technologies (short-read and long-read), DeepSomatic consistently outperforms existing callers, particularly for indels. Competing Interests: K.S., D.E.C., P.C., A. Kolesnikov, L.B., J.C.M., and A.C. are employees of Google LLC and own Alphabet stock as part of the standard compensation package. M.S.F. is a part of the speakers bureau for Bayer and PacBio. |
Database: | MEDLINE |
External link: |