Mutation discovery in regions of segmental cancer genome amplifications with CoNAn-SNV: a mixture model for next generation sequencing of tumors
Author: | Samuel Aparicio, Arusha Oloumi, Marco A. Marra, Martin Hirst, Jiarui Ding, Rodrigo Goya, Thomas Zeng, Anamaria Crisan, Kane Tse, Leah M Prentice, Sohrab P. Shah, Allen Delaney, Janine Senz, David G. Huntsman, Gavin Ha |
---|---|
Publication year: | 2011 |
Subject: |
DNA Copy Number Variations; Genetic Causes of Cancer; Copy number analysis; lcsh:Medicine; Locus (genetics); Biology; Genome; DNA sequencing; 03 medical and health sciences; 0302 clinical medicine; Germline mutation; Genome Analysis Tools; Neoplasms; Basic Cancer Research; Genetics; Cancer Genetics; Humans; Genome Sequencing; lcsh:Science; Gene; 030304 developmental biology; Segmental duplication; 0303 health sciences; Multidisciplinary; Models, Genetic; Shotgun sequencing; Cancer Risk Factors; lcsh:R; Computational Biology; Genomics; Genome Scans; Oncology; 030220 oncology & carcinogenesis; Mutation; Medicine; lcsh:Q; Algorithms; Genes, Neoplasm; Research Article |
Source: | PLoS ONE, Vol 7, Iss 8, p e41551 (2012) |
ISSN: | 1932-6203 |
Description: | Next generation sequencing has now enabled cost-effective enumeration of the full mutational complement of a tumor genome, in particular single nucleotide variants (SNVs). Most current computational and statistical models for analyzing next generation sequencing data, however, do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs), which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of SNVs that overlap copy number alterations. The method is based on the notion that genomic regions of segmental duplication and amplification induce an extended genotype space in which a subset of genotypes will exhibit heavily skewed allelic distributions in SNVs (and therefore be undetectable by methods that assume diploidy). We introduce the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models, where the number of mixture components for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it is able to discover 21 experimentally revalidated somatic non-synonymous mutations that were not detected using copy number insensitive SNV detection algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in disproportionate loss of specificity. This was also supported by analysis of a recently published lymphoma genome with a relatively quiescent karyotype, where CoNAn-SNV showed similar results to other callers except in regions of copy number gain, where increased sensitivity was conferred. 
Our results indicate that in genomically unstable tumors, copy number annotation for SNV detection will be critical to fully characterize the mutational landscape of cancer genomes. |
Database: | OpenAIRE |
External link: |
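
The idea in the description, that the copy number state at a locus sets the number of Binomial mixture components, can be illustrated with a minimal sketch. This is not the CoNAn-SNV implementation: the function names, the uniform prior over genotypes, the fixed `error_rate`, and the choice of one component per possible count of variant allele copies are all assumptions made for illustration only.

```python
import math


def genotype_posteriors(alt_reads, depth, copy_number, error_rate=0.01):
    """Posterior over the number of variant allele copies (0..copy_number).

    Hypothetical sketch: one Binomial component per genotype, uniform prior.
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must lie strictly between 0 and 0.5")

    log_weights = []
    for k in range(copy_number + 1):
        # Expected variant read fraction when k of copy_number copies carry
        # the variant, kept away from 0 and 1 by the assumed error rate.
        p = min(max(k / copy_number, error_rate), 1.0 - error_rate)
        # Binomial log-likelihood; the binomial coefficient is identical for
        # every component, so it cancels in the normalization and is omitted.
        log_weights.append(
            alt_reads * math.log(p) + (depth - alt_reads) * math.log(1.0 - p)
        )

    # Normalize in log space to avoid underflow at high read depth.
    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def variant_probability(alt_reads, depth, copy_number, error_rate=0.01):
    """Probability that at least one copy carries the variant."""
    return 1.0 - genotype_posteriors(alt_reads, depth, copy_number, error_rate)[0]


# 8 variant reads out of 60: a skewed allelic ratio of roughly 13%.
diploid_call = variant_probability(8, 60, copy_number=2)
amplified_call = variant_probability(8, 60, copy_number=5)
```

Under these assumptions, the diploid model (components at 0, 1/2 and 1) assigns the skewed site a variant probability below 0.05, while the five-copy model, which includes a component near 1/5, assigns it a probability above 0.99. This mirrors the description's point that skewed allelic distributions in amplified regions are missed by methods that assume diploidy.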