Algorithmic improvements for discovery of germline copy number variants in next-generation sequencing data

Authors: Brendan O’Fallon, Jacob Durtschi, Ana Kellogg, Tracey Lewis, Devin Close, Hunter Best
Language: English
Year of publication: 2022
Subject:
Source: BMC Bioinformatics, Vol 23, Iss 1, Pp 1-14 (2022)
Document type: article
ISSN: 1471-2105
DOI: 10.1186/s12859-022-04820-w
Description: Abstract. Background: Copy number variants (CNVs) play a significant role in human heredity and disease. However, sensitive and specific characterization of germline CNVs from NGS data has remained challenging, particularly for hybridization-capture data in which read counts are the primary source of copy number information. Results: We describe two algorithmic adaptations that improve CNV detection accuracy in a Hidden Markov Model (HMM) context. First, we present a method for computing target- and copy number-specific emission distributions. Second, we demonstrate that the Pointwise Maximum a posteriori (PMAP) HMM decoding procedure yields improved sensitivity for small CNV calls compared to the more common Viterbi HMM decoder. We develop a prototype implementation, called Cobalt, and compare it to other CNV detection tools using sets of simulated and previously detected CNVs with sizes ranging from a single exon to a full chromosome. Conclusions: In both the simulation and previously detected CNV studies, Cobalt shows similar sensitivity but significantly fewer false positive detections compared to other callers. Overall sensitivity is 80–90% for deletion CNVs spanning 1–4 targets and 90–100% for larger deletion events, while sensitivity is somewhat lower for small duplication CNVs.
Database: Directory of Open Access Journals
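
Illustration: the description contrasts Pointwise Maximum a posteriori (PMAP) decoding with Viterbi decoding. The sketch below shows the generic difference between the two on a small HMM. It is not Cobalt's implementation. The function names, the toy three-state model (deletion, diploid, duplication) and all probabilities are assumptions made up for this example. Emission likelihoods are passed in as one row per target, so each target may have its own values. PMAP picks the most probable state at each position from the forward-backward posteriors, while Viterbi picks the single most probable path as a whole.

```python
def forward_backward(init, trans, emis):
    """Per-position posterior state probabilities for an HMM.

    init[s]     : prior probability of state s at the first position
    trans[r][s] : probability of moving from state r to state s
    emis[t][s]  : likelihood of the observation at position t given state s
    """
    n, k = len(emis), len(init)

    # Forward pass, rescaled at every step to avoid numerical underflow.
    prev = [init[s] * emis[0][s] for s in range(k)]
    z = sum(prev)
    prev = [p / z for p in prev]
    fwd = [prev]
    for t in range(1, n):
        cur = [emis[t][s] * sum(prev[r] * trans[r][s] for r in range(k))
               for s in range(k)]
        z = sum(cur)
        prev = [c / z for c in cur]
        fwd.append(prev)

    # Backward pass, also rescaled at every step.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        cur = [sum(trans[s][r] * emis[t + 1][r] * bwd[t + 1][r] for r in range(k))
               for s in range(k)]
        z = sum(cur)
        bwd[t] = [c / z for c in cur]

    # Combine and normalise so each position's posteriors sum to 1.
    post = []
    for t in range(n):
        g = [fwd[t][s] * bwd[t][s] for s in range(k)]
        z = sum(g)
        post.append([x / z for x in g])
    return post


def pmap_decode(init, trans, emis):
    """Pointwise MAP: the most probable state at each position, chosen independently."""
    post = forward_backward(init, trans, emis)
    return [max(range(len(p)), key=p.__getitem__) for p in post]


def viterbi_decode(init, trans, emis):
    """Viterbi: the single most probable state path (rescaled max-product)."""
    n, k = len(emis), len(init)
    score = [init[s] * emis[0][s] for s in range(k)]
    back = []
    for t in range(1, n):
        ptr, new = [], []
        for s in range(k):
            best = max(range(k), key=lambda r: score[r] * trans[r][s])
            ptr.append(best)
            new.append(score[best] * trans[best][s] * emis[t][s])
        z = sum(new)  # rescaling does not change which path is best
        score = [x / z for x in new]
        back.append(ptr)
    state = max(range(k), key=score.__getitem__)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical toy model: states 0 = deletion, 1 = diploid, 2 = duplication.
    init = [0.01, 0.98, 0.01]
    trans = [[0.90, 0.10, 0.00],
             [0.005, 0.99, 0.005],
             [0.00, 0.10, 0.90]]
    # Made-up emission likelihoods for five targets; the third weakly favours a deletion.
    emis = [[0.05, 0.90, 0.05],
            [0.05, 0.90, 0.05],
            [0.70, 0.25, 0.05],
            [0.05, 0.90, 0.05],
            [0.05, 0.90, 0.05]]
    print("PMAP   :", pmap_decode(init, trans, emis))
    print("Viterbi:", viterbi_decode(init, trans, emis))
```

Because PMAP scores each position separately, it can in principle flag a short event that the best overall path smooths over, which is consistent with the sensitivity gain for small CNVs reported in the description. Whether the two decoders actually differ on a given input depends on the model parameters.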