PseUpred-ELPSO Is an Ensemble Learning Predictor with Particle Swarm Optimizer for Improving the Prediction of RNA Pseudouridine Sites

Authors: Xiao Wang, Pengfei Li, Rong Wang, Xu Gao
Language: English
Year of publication: 2024
Subject:
Source: Biology, Vol 13, Iss 4, p 248 (2024)
Document type: article
ISSN: 2079-7737
DOI: 10.3390/biology13040248
Description: RNA pseudouridine modification exists in different RNA types of many species and plays a significant role in regulating the expression of biological processes. Understanding the functional mechanisms of RNA pseudouridine sites requires accurate identification of these sites in RNA sequences. Although several fast and inexpensive computational methods have been proposed, improving recognition accuracy and generalization remains a challenge. This study proposes a novel ensemble predictor, PseUpred-ELPSO, for improved RNA pseudouridine site prediction. After the nucleotide composition preferences among RNA pseudouridine site sequences were analyzed, two feature representations were selected and fed into a stacking ensemble framework. Five tree-based machine learning classifiers served as base classifiers, and their outputs were used to construct 30-dimensional RNA profiles representing the RNA sequences. The particle swarm optimization (PSO) algorithm was then used to search for weights of the RNA profiles to further enhance the representation, and a logistic regression classifier was used as the meta-classifier to make the final predictions. PseUpred-ELPSO outperforms the most advanced existing predictors in both cross-validation and the independent test. Based on the PseUpred-ELPSO predictor, a free and easy-to-operate web server has been established, which will be a powerful tool for pseudouridine site identification.
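The weight search mentioned in the description can be illustrated with a minimal sketch of a global-best particle swarm optimizer. This is not the authors' implementation: the function names `pso_minimize` and `weighted_error`, the three-column toy `profiles`, the squared-error objective, and all hyperparameters are assumptions chosen only to show the general technique of searching profile weights with PSO.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over [0, 1]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    # Random starting positions, zero starting velocities.
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests and the swarm-wide best seen so far.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Clamp each weight to the [0, 1] search range.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

# Hypothetical toy data: each row holds the positive-class probabilities
# produced by three base classifiers for one sequence (not data from the paper).
profiles = [[0.9, 0.6, 0.4], [0.8, 0.7, 0.5], [0.2, 0.4, 0.6], [0.1, 0.3, 0.5]]
labels = [1, 1, 0, 0]

def weighted_error(weights):
    """Mean squared error of the normalised weighted profile score (assumed objective)."""
    total = sum(weights) or 1e-12  # guard against an all-zero weight vector
    err = 0.0
    for row, y in zip(profiles, labels):
        score = sum(w * p for w, p in zip(weights, row)) / total
        err += (score - y) ** 2
    return err / len(labels)

best_w, best_val = pso_minimize(weighted_error, dim=3)
```

In this sketch the weighted score is compared to the labels directly. In a stacking setup like the one described, the weighted profiles would instead be passed to the meta-classifier, and the objective would typically be a cross-validated performance measure.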
Database: Directory of Open Access Journals