Nonconvex regularization for sparse genomic variant signal detection

Authors: Roummel F. Marcia, Katharine Sanderson, Jonathan Sahagun, Mario Banuelos, Rubi Almanza, Andrew Fujikawa, Melissa Spence, Suzanne Sindi, Lasith Adhikari
Year of publication: 2017
Subject:
Source: MeMeA
Description: Recent research suggests that an overwhelming proportion of humans carry genomic structural variants (SVs): rearrangements of genomic regions such as inversions, insertions, deletions, and duplications. The standard approach to detecting SVs in an unknown genome is to sequence paired reads from that genome, map them to a reference genome, and analyze the resulting configuration of fragments for evidence of rearrangements. Because SVs occur relatively infrequently in the human genome and erroneous read mappings can suggest an SV that is not there, SV detection methods typically suffer from high false-positive rates. Our approach aims to distinguish true from false SVs more accurately in two ways. First, we solve a constrained optimization problem whose objective is a negative Poisson log-likelihood with an additive penalty term that promotes sparsity. Second, we analyze multiple related individuals simultaneously and enforce familial constraints: any SV predicted in a child must be present in at least one of its parents. This formulation lowers the false-positive rate despite a large amount of error from both DNA sequencing and mapping. By incorporating this additional information, we improve the model formulation and increase the accuracy of SV prediction.
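Illustration (not part of the original record): a minimal Python sketch of the kind of objective the description outlines, a negative Poisson log-likelihood plus a sparsity-promoting penalty, together with a check of the familial constraint. The record gives no formulas, so everything specific here is an assumption made for illustration: the penalty form sum |f_j|^p with 0 < p < 1 (suggested only by "nonconvex" in the title), the scalar `coverage` observation model, the bound of each child signal by the larger parent signal, and all function and parameter names.

```python
import math

def neg_poisson_loglik(y, mu, eps=1e-10):
    """Negative Poisson log-likelihood of counts y under means mu.

    The constant log(y!) term is dropped because it does not depend on mu.
    eps is a small assumed offset that keeps the logarithm finite at mu = 0.
    """
    return sum(m - obs * math.log(m + eps) for obs, m in zip(y, mu))

def lp_penalty(f, p=0.5):
    """Assumed nonconvex sparsity penalty: sum_j |f_j|^p with 0 < p < 1."""
    return sum(abs(v) ** p for v in f)

def objective(y, f, coverage, tau=1.0, p=0.5):
    """Penalized objective for a candidate SV signal f.

    Hypothetical observation model: expected fragment count = coverage * f_j.
    tau weights the sparsity penalty against the data-fit term.
    """
    mu = [coverage * v for v in f]
    return neg_poisson_loglik(y, mu) + tau * lp_penalty(f, p)

def satisfies_familial_constraint(child, parent_a, parent_b, tol=1e-12):
    """True if each child signal is nonnegative and no larger than the
    larger of the two parent signals at the same candidate location
    (one possible encoding of "present in at least one parent")."""
    return all(
        0.0 <= c <= max(a, b) + tol
        for c, a, b in zip(child, parent_a, parent_b)
    )
```

For example, `satisfies_familial_constraint([0, 1], [0, 1], [0, 0])` holds because the child's only variant also appears in the first parent, whereas a child signal of `[1, 0]` with both parents at `[0, 0]` violates the constraint. An actual solver would minimize `objective` over the feasible set; that step is omitted here.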
Database: OpenAIRE