Improve-RRBS: a novel tool to correct the 3' trimming of reduced representation sequencing reads.

Authors: Fóthi Á; Institute of Molecular Life Sciences, Research Center for Natural Sciences, HUN-REN, Budapest 1117, Hungary.; Department of Molecular Biology, Semmelweis University, Budapest 1094, Hungary., Liu H; Renal Electrolyte and Hypertension Division, Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States.; Penn/CHOP Kidney Innovation Center, University of Pennsylvania, Philadelphia, PA 19104, United States.; Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, United States., Susztak K; Renal Electrolyte and Hypertension Division, Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States.; Penn/CHOP Kidney Innovation Center, University of Pennsylvania, Philadelphia, PA 19104, United States.; Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, United States., Aranyi T; Institute of Molecular Life Sciences, Research Center for Natural Sciences, HUN-REN, Budapest 1117, Hungary.; Department of Molecular Biology, Semmelweis University, Budapest 1094, Hungary.
Language: English
Source: Bioinformatics advances [Bioinform Adv] 2024 May 24; Vol. 4 (1), pp. vbae076. Date of Electronic Publication: 2024 May 24 (Print Publication: 2024).
DOI: 10.1093/bioadv/vbae076
Abstract: Motivation: Reduced Representation Bisulfite Sequencing (RRBS) is a popular approach for determining DNA methylation in the CpG-rich regions of the genome. However, we observed that the standard computational analysis also identifies false positive differentially methylated sites (DMS).
Results: During RRBS library preparation, the MspI-digested DNA undergoes end-repair with a cytosine at the 3' end of the fragments. After sequencing, Trim Galore removes these end-repaired nucleotides. However, Trim Galore fails to detect end-repair when it overlaps with the 3' end of the sequencing reads. We found that these non-trimmed cytosines bias methylation calling and can therefore cause DMS to be identified erroneously. To circumvent this problem, we developed improve-RRBS, which efficiently identifies these cytosines and hides them from methylation calling, with a false positive rate of at most 0.5%. To test improve-RRBS, we investigated four datasets from four laboratories and two different species. We found non-trimmed 3' cytosines in all datasets analyzed and, under certain conditions, more than 50% false positive DMS. After applying improve-RRBS, these DMS disappeared completely from all comparisons.
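Illustration (not part of the published abstract): the abstract does not spell out how improve-RRBS locates the non-trimmed cytosines, so the following is only a toy sketch of the general idea, not the tool's actual algorithm or API. It assumes a gapless forward-strand alignment, assumes that the end-repaired cytosine corresponds to the second C of an MspI site (CCGG) in the reference, and treats a read as affected when its last aligned base is that C or the G right after it. The function names and the coordinate convention (0-based) are hypothetical.

```python
MSPI_SITE = "CCGG"  # MspI recognition sequence


def mspi_sites(reference: str) -> list[int]:
    """Return the 0-based start position of every CCGG in the reference."""
    sites = []
    pos = reference.find(MSPI_SITE)
    while pos != -1:
        sites.append(pos)
        pos = reference.find(MSPI_SITE, pos + 1)
    return sites


def untrimmed_end_repair_position(reference: str, read_start: int, read_length: int):
    """Return the reference position of a suspected end-repaired cytosine.

    Assumption (illustrative only): a forward-strand, gapless read is
    affected when its last aligned base is the second C of a CCGG site
    or the G immediately after it. Returns None when no site matches.
    """
    read_end = read_start + read_length - 1  # last aligned base, 0-based
    for site in mspi_sites(reference):
        filled_c = site + 1  # second C of CCGG (assumed end-repair position)
        if read_start <= filled_c and read_end in (filled_c, filled_c + 1):
            return filled_c
    return None


if __name__ == "__main__":
    toy_reference = "AATTCCGGTTAACCGGAT"  # made-up sequence with two MspI sites
    print(untrimmed_end_repair_position(toy_reference, 0, 7))
    print(untrimmed_end_repair_position(toy_reference, 0, 10))
```

In a real pipeline, a position flagged this way would be excluded from methylation calling for that read; reverse-strand reads, soft clipping and indels would all need separate handling that this sketch leaves out.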
Availability and Implementation: Improve-RRBS is a freely available Python package (https://pypi.org/project/iRRBS/ or https://github.com/fothia/improve-RRBS) that can be integrated into RRBS pipelines.
Competing Interests: None declared.
(© The Author(s) 2024. Published by Oxford University Press.)
Database: MEDLINE