Pairtools: From sequencing data to chromosome contacts.

Authors: Abdennur N; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America.; Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America., Fudenberg G; Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, California, United States of America., Flyamer IM; Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland., Galitsyna AA; Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America.; Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Vienna, Austria., Goloborodko A; Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Vienna, Austria., Imakaev M; Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America., Venev SV; Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America.
Language: English
Source: PLoS computational biology [PLoS Comput Biol] 2024 May 29; Vol. 20 (5), pp. e1012164. Date of Electronic Publication: 2024 May 29 (Print Publication: 2024).
DOI: 10.1371/journal.pcbi.1012164
Abstract: The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. The core operations provided by pairtools are parsing of .sam alignments into Hi-C pairs, sorting, and removal of PCR duplicates. In addition, pairtools provides auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for restriction-based protocols, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.
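Illustrative sketch (not taken from the article): the sorting and duplicate-removal steps named in the abstract can be pictured with a few lines of plain Python. This does not use or reproduce the pairtools code. The `Pair` record, the function names `flip`, `sort_pairs`, `dedup_exact`, and `process`, the lexicographic chromosome order, and the rule that only exact coordinate and strand matches count as duplicates are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple


class Pair(NamedTuple):
    """Hypothetical minimal contact record: two aligned read ends."""
    read_id: str
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    strand1: str
    strand2: str


def flip(p: Pair) -> Pair:
    """Order the two sides so that side 1 never sorts after side 2.

    Assumption: chromosomes are compared as plain strings.
    """
    if (p.chrom1, p.pos1) > (p.chrom2, p.pos2):
        return Pair(p.read_id, p.chrom2, p.pos2, p.chrom1, p.pos1,
                    p.strand2, p.strand1)
    return p


def sort_pairs(pairs: Iterable[Pair]) -> List[Pair]:
    """Sort by chromosome pair, then by the two positions (stable sort)."""
    return sorted(pairs, key=lambda p: (p.chrom1, p.chrom2, p.pos1, p.pos2))


def dedup_exact(sorted_pairs: Iterable[Pair]) -> List[Pair]:
    """Keep the first pair seen for each (coordinates, strands) combination.

    Assumption: only exact matches are treated as PCR duplicates.
    """
    seen = set()
    kept = []
    for p in sorted_pairs:
        key = p[1:]  # every field except read_id
        if key in seen:
            continue
        seen.add(key)
        kept.append(p)
    return kept


def process(pairs: Iterable[Pair]) -> List[Pair]:
    """Flip, sort, then deduplicate, mirroring the order in the abstract."""
    return dedup_exact(sort_pairs(flip(p) for p in pairs))
```

Flipping before sorting means a contact reported as (A, B) by one read and (B, A) by another maps to the same key, so the second copy is dropped during deduplication.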
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Open2C et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Database: MEDLINE