Pairtools: from sequencing data to chromosome contacts.

Authors: Abdennur N; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA.; Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA., Fudenberg G; Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, CA, USA., Flyamer IM; Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH-4058 Basel, Switzerland., Galitsyna AA; Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA.; Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria., Goloborodko A; Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria., Imakaev M; Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA., Venev SV; Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA.
Language: English
Source: BioRxiv: the preprint server for biology [bioRxiv] 2023 Feb 15. Date of Electronic Publication: 2023 Feb 15.
DOI: 10.1101/2023.02.13.528389
Abstract: The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. It provides both crucial core tools and auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for multi-way contacts, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.
Database: MEDLINE