TransAC4C-a novel interpretable architecture for multi-species identification of N4-acetylcytidine sites in RNA with single-base resolution.
Authors: | Liu R; Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China., Zhang Y; Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China., Wang Q; Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China., Zhang X; Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China.; Shenzhen Huazhong University of Science and Technology Research Institute, Shenzhen, 518000, China. |
---|---|
Language: | English |
Source: | Briefings in bioinformatics [Brief Bioinform] 2024 Mar 27; Vol. 25 (3). |
DOI: | 10.1093/bib/bbae200 |
Abstract: | N4-acetylcytidine (ac4C) is a modification found in ribonucleic acid (RNA) related to diseases. Expensive and labor-intensive methods hindered the exploration of ac4C mechanisms and the development of specific anti-ac4C drugs. Therefore, an advanced prediction model for ac4C in RNA is urgently needed. Despite the construction of various prediction models, several limitations exist: (1) insufficient resolution at base level for ac4C sites; (2) lack of information on species other than Homo sapiens; (3) lack of information on RNA other than mRNA; and (4) lack of interpretation for each prediction. In light of these limitations, we have reconstructed the previous benchmark dataset and introduced a new dataset including balanced RNA sequences from multiple species and RNA types, while also providing base-level resolution for ac4C sites. Additionally, we have proposed a novel transformer-based architecture and pipeline for predicting ac4C sites, allowing for highly accurate predictions, visually interpretable results and no restrictions on the length of input RNA sequences. Statistically, our work has improved the accuracy of predicting specific ac4C sites in multiple species from less than 40% to around 85%, achieving a high AUC > 0.9. These results significantly surpass the performance of all existing models. (© The Author(s) 2024. Published by Oxford University Press.) |
Database: | MEDLINE |