Comprehensive Review and Assessment of Computational Methods for Prediction of N6-Methyladenosine Sites.

Authors: Luo, Zhengtao; Yu, Liyi; Xu, Zhaochun; Liu, Kening; Gu, Lichuan
Source: Biology (2079-7737); Oct 2024, Vol. 13 Issue 10, p777, 24p
Abstract: Simple Summary: This study provides a comprehensive review and evaluation of computational methods for the prediction of N6-methyladenosine (m6A) sites, which are crucial in regulating cellular functions and gene expression. Advances in high-confidence m6A site mapping have enabled the development of robust computational approaches. We assess 52 computational methods, including machine learning, deep learning, and ensemble-based techniques, using 13 benchmark datasets from nine different species. The evaluation reveals that deep learning methods generally surpass traditional scoring function-based approaches. This systematic analysis aims to guide the design and refinement of computational tools for m6A identification, facilitating rigorous method comparison and supporting future research in RNA modifications. The findings are vital in understanding m6A-dependent mRNA regulation, with implications in addressing diseases like cancer, where m6A plays a regulatory role.
N6-methyladenosine (m6A) plays a crucial regulatory role in the control of cellular functions and gene expression. Recent advances in sequencing techniques for transcriptome-wide m6A mapping have accelerated the accumulation of m6A site information at a single-nucleotide level, providing more high-confidence training data to develop computational approaches for m6A site prediction. However, it is still a major challenge to precisely predict m6A sites using in silico approaches. To advance the computational support for m6A site identification, here, we curated 13 up-to-date benchmark datasets from nine different species (i.e., H. sapiens, M. musculus, Rat, S. cerevisiae, Zebrafish, A. thaliana, Pig, Rhesus, and Chimpanzee). This will assist the research community in conducting an unbiased evaluation of alternative approaches and support future research on m6A modification. We revisited 52 computational approaches published since 2015 for m6A site identification, including 30 traditional machine learning-based, 14 deep learning-based, and 8 ensemble learning-based methods. We comprehensively reviewed these computational approaches in terms of their training datasets, calculated features, computational methodologies, performance evaluation strategies, and webserver/software usability. Using these benchmark datasets, we benchmarked nine predictors that are available as web servers or stand-alone software and assessed their prediction performance. We found that deep learning and traditional machine learning approaches generally outperformed scoring function-based approaches. In summary, the curated benchmark dataset repository and the systematic assessment in this study serve to inform the design and implementation of state-of-the-art computational approaches for m6A identification and facilitate more rigorous comparisons of new methods in the future. [ABSTRACT FROM AUTHOR]
Database: Complementary Index