Detecting copy number variation with mated short reads
Author: | Michael Brudno, Paul Medvedev, Tim Smith, Marc Fiume, Misko Dzamba |
---|---|
Year of publication: | 2010 |
Subject: |
Resource
Genetics; Base Sequence; DNA Copy Number Variations; DNA Mutational Analysis; Breakpoint; Copy number analysis; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Chromosome Breakage; DNA Shuffling; Computational biology; Biology; Genome; Identification (information); Humans; Graph (abstract data type); Human genome; Copy-number variation; Base Pairing; Algorithms; Genetics (clinical) |
Source: | Genome Research. 20:1613-1622 |
ISSN: | 1088-9051 |
DOI: | 10.1101/gr.106344.110 |
Description: | The development of high-throughput sequencing (HTS) technologies has opened the door to novel methods for detecting copy number variants (CNVs) in the human genome. While in the past CNVs have been detected based on array CGH data, recent studies have shown that depth-of-coverage information from HTS technologies can also be used for the reliable identification of large copy-variable regions. Such methods, however, are hindered by sequencing biases that lead certain regions of the genome to be over- or undersampled, lowering their resolution and ability to accurately identify the exact breakpoints of the variants. In this work, we develop a method for CNV detection that supplements the depth-of-coverage with paired-end mapping information, where mate pairs mapping discordantly to the reference serve to indicate the presence of variation. Our algorithm, called CNVer, combines this information within a unified computational framework called the donor graph, allowing us to better mitigate the sequencing biases that cause uneven local coverage and accurately predict CNVs. We use CNVer to detect 4879 CNVs in the recently described genome of a Yoruban individual. Most of the calls (77%) coincide with previously known variants within the Database of Genomic Variants, while 81% of deletion copy number variants previously known for this individual coincide with one of our loss calls. Furthermore, we demonstrate that CNVer can reconstruct the absolute copy counts of segments of the donor genome and evaluate the feasibility of using CNVer with low coverage datasets. |
Database: | OpenAIRE |
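
The description mentions mate pairs that map discordantly to the reference as a signal of variation. As a rough illustration of that general idea only (this is not CNVer's donor graph algorithm, which the record does not detail), the sketch below labels a mate pair by comparing its mapped span with the expected insert size. The names `MatePair` and `classify_pair`, the threshold `k`, and the example insert-size statistics are all assumptions made for this example; orientation-based signals are left out.

```python
from dataclasses import dataclass


@dataclass
class MatePair:
    """Hypothetical container for one mapped mate pair."""
    left_pos: int   # reference coordinate where the leftmost mate starts
    right_pos: int  # reference coordinate where the rightmost mate ends


def classify_pair(pair: MatePair, mean_insert: float, sd_insert: float,
                  k: float = 3.0) -> str:
    """Label a pair by how far its mapped span deviates from the library.

    A span more than k standard deviations above the mean is 'stretched'
    (consistent with sequence missing from the donor), one more than k
    standard deviations below is 'compressed' (consistent with extra
    sequence in the donor), and anything else is 'concordant'.
    """
    span = pair.right_pos - pair.left_pos
    if span > mean_insert + k * sd_insert:
        return "stretched"
    if span < mean_insert - k * sd_insert:
        return "compressed"
    return "concordant"


# Assumed library: mean insert size 200 bp, standard deviation 10 bp.
labels = [
    classify_pair(MatePair(1000, 1200), 200, 10),
    classify_pair(MatePair(1000, 1500), 200, 10),
    classify_pair(MatePair(1000, 1100), 200, 10),
]
```

In practice, signals of this kind would be clustered and weighed against depth of coverage, which is the combination the description attributes to CNVer.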