SimLoRD : Simulation of Long Read Data
Author: | Sven Rahmann, Bianca K. Stöcker, Johannes Köster |
---|---|
Language: | English |
Year of publication: | 2016 |
Subject: |
0301 basic medicine; Statistics and Probability; Computer science; 0206 medical engineering; Medicine; Genomics; 02 engineering and technology; Biochemistry; 03 medical and health sciences; Computer Simulation; Molecular Biology; business.industry; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Third generation; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Pacific biosciences; business; 020602 bioinformatics; Computer hardware; Software; Single molecule real time sequencing |
Source: | Bioinformatics, 32(17), 2704-2706 |
ISSN: | 1367-4803 |
Description: | Motivation: Third generation sequencing methods provide longer reads than second generation methods and have distinct error characteristics. While many read simulators exist for second generation data, the choice for third generation data is very limited. Results: We analyzed public data from Pacific Biosciences (PacBio) SMRT sequencing, developed an error model and implemented it in a new read simulator called SimLoRD. It offers options to choose the read length distribution and to model error probabilities depending on the number of passes through the sequencer. The new error model makes SimLoRD the most realistic SMRT read simulator available. Availability and Implementation: SimLoRD is available open source at http://bitbucket.org/genomeinformatics/simlord/ and installable via Bioconda (http://bioconda.github.io). Contact: Bianca.Stoecker@uni-due.de or Sven.Rahmann@uni-due.de. Supplementary information: Supplementary data are available at Bioinformatics online. |
Database: | OpenAIRE |
External link: |