QuickDeconvolution: fast and scalable deconvolution of linked-read sequencing data.

Authors: Faure R; INRIA RBA, CNRS UMR 6074, University of Rennes, Rennes, France. Lavenier D; INRIA RBA, CNRS UMR 6074, University of Rennes, Rennes, France.
Language: English
Source: Bioinformatics advances [Bioinform Adv] 2022 Sep 26; Vol. 2 (1), pp. vbac068. Date of Electronic Publication: 2022 Sep 26 (Print Publication: 2022).
DOI: 10.1093/bioadv/vbac068
Abstract: Motivation: Recently introduced linked-read technologies, such as the 10× Chromium system, use microfluidics to tag multiple short reads from the same long fragment (50-200 kb) with a short sequence called a barcode. They are inexpensive and easy to prepare, combining the accuracy of short-read sequencing with the long-range information of barcodes. However, the same barcode can be used for several different fragments, which complicates the analyses.
Results: We present QuickDeconvolution (QD), a new software tool for deconvolving a set of reads sharing a barcode, i.e. separating the reads that come from different fragments. QD takes only sequencing data as input and does not need a reference genome. We show that QD outperforms existing software in terms of accuracy, speed and scalability, making it capable of deconvolving previously inaccessible datasets. In particular, we demonstrate here the first example in the literature of a successfully deconvolved animal sequencing dataset, a 33-Gb Drosophila melanogaster dataset. We also show that the taxonomic assignment of linked reads can be improved by deconvolving reads with QD before taxonomic classification.
Availability and Implementation: Code and instructions are available at https://github.com/RolandFaure/QuickDeconvolution.
Supplementary Information: Supplementary data are available at Bioinformatics Advances online.
(© The Author(s) 2022. Published by Oxford University Press.)
Database: MEDLINE