Flexible parsing, interpretation, and editing of technical sequences with splitcode.

Authors: Sullivan DK (UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, United States; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States); Pachter L (Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, United States)
Language: English
Source: Bioinformatics (Oxford, England) [Bioinformatics] 2024 Jun 03; Vol. 40 (6).
DOI: 10.1093/bioinformatics/btae331
Abstract: Motivation: Next-generation sequencing libraries are constructed with numerous synthetic constructs such as sequencing adapters, barcodes, and unique molecular identifiers. Such sequences can be essential for interpreting results of sequencing assays, and when they contain information pertinent to an experiment, they must be processed and analyzed.
Results: We present splitcode, a tool that enables flexible and efficient parsing, interpreting, and editing of sequencing reads. This versatile tool facilitates simple, reproducible preprocessing of reads from libraries constructed for a large array of single-cell and bulk sequencing assays.
Availability and Implementation: The splitcode program is available at http://github.com/pachterlab/splitcode.
(© The Author(s) 2024. Published by Oxford University Press.)
Database: MEDLINE