Predmoter: cross-species prediction of plant promoter and enhancer regions.

Authors: Kindel F; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany., Triesch S; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany.; Cluster of Excellence on Plant Sciences (CEPLAS), Germany., Schlüter U; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany., Randarevitch LA; Cluster of Excellence on Plant Sciences (CEPLAS), Germany.; Institute of Population Genetics, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany., Reichel-Deland V; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany., Weber APM; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany.; Cluster of Excellence on Plant Sciences (CEPLAS), Germany., Denton AK; Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany.; Cluster of Excellence on Plant Sciences (CEPLAS), Germany.; Valence Labs, Montréal, Québec H2S 3H1, Canada.
Language: English
Source: Bioinformatics advances [Bioinform Adv] 2024 May 24; Vol. 4 (1), pp. vbae074. Date of Electronic Publication: 2024 May 24 (Print Publication: 2024).
DOI: 10.1093/bioadv/vbae074
Abstract: Motivation: Identifying cis-regulatory elements (CREs) is crucial for analyzing gene regulatory networks. Next-generation sequencing methods were developed to identify CREs, but they represent a considerable expenditure when only a few genomic loci are targeted. Predicting the outputs of these methods would therefore significantly cut costs and time investment.
Results: We present Predmoter, a deep neural network that predicts base-wise Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) and histone chromatin immunoprecipitation DNA-sequencing (ChIP-seq) read coverage for plant genomes. Predmoter uses only the DNA sequence as input. We trained our final model on 21 species; ATAC-seq data were publicly available for 13 of them and ChIP-seq data for 17. We evaluated our models on Arabidopsis thaliana and Oryza sativa. Our best models showed accurate predictions in peak position and pattern for ATAC- and histone ChIP-seq. Annotating putatively accessible chromatin regions provides valuable input for the identification of CREs. In conjunction with other in silico data, this can significantly reduce the search space for experimentally verifiable DNA-protein interaction pairs.
Availability and Implementation: The source code for Predmoter is available at: https://github.com/weberlab-hhu/Predmoter. Predmoter takes a fasta file as input and outputs h5, and optionally bigWig and bedGraph files.
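As an illustration of the bedGraph output format mentioned above, the following is a minimal sketch of how a base-wise coverage track can be collapsed into bedGraph intervals (0-based, half-open coordinates). It is not taken from the Predmoter source code; the function name `coverage_to_bedgraph`, the chromosome label, and the coverage values are hypothetical and chosen only for the example.

```python
from itertools import groupby


def coverage_to_bedgraph(chrom, coverage):
    """Collapse per-base values into bedGraph lines (hypothetical helper).

    bedGraph uses 0-based, half-open intervals: chrom, start, end, value.
    Runs of zero coverage are skipped here, which bedGraph permits.
    """
    lines = []
    start = 0
    for value, run in groupby(coverage):
        end = start + sum(1 for _ in run)  # length of this run of equal values
        if value != 0:
            lines.append(f"{chrom}\t{start}\t{end}\t{value}")
        start = end
    return lines


# Made-up coverage for eight bases on a made-up sequence name.
example = coverage_to_bedgraph("Chr1", [0, 0, 3, 3, 3, 1, 0, 2])
print("\n".join(example))
```

Merging consecutive bases with identical values keeps the file small when predicted coverage is flat over long stretches; whether zero-coverage runs are written or omitted is a design choice assumed here for brevity.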
Competing Interests: AKD is now an employee at Valence Labs, part of Recursion Pharmaceuticals, Inc., and has received real ownership interest in the company.
(© The Author(s) 2024. Published by Oxford University Press.)
Database: MEDLINE