Recognizing binding sites of poorly characterized RNA-binding proteins on circular RNAs using attention Siamese network
Authors: | Hehe Wu, Xiaoyong Pan, Yang Yang, Hong-Bin Shen |
---|---|
Year of publication: | 2021 |
Subject: | Source code; Computer science; RNA-binding proteins; Computational biology; Medical and health sciences; Binding sites; Molecular biology; Developmental biology; Health sciences; Training set; Artificial neural network; Deep learning; Biochemistry & molecular biology; RNA, Circular; Metric (mathematics); Labeled data; Neural Networks, Computer; Artificial intelligence; Information systems |
Source: | Briefings in Bioinformatics. 22 |
ISSN: | 1477-4054 1467-5463 |
DOI: | 10.1093/bib/bbab279 |
Description: | Circular RNAs (circRNAs) interact with RNA-binding proteins (RBPs) to play crucial roles in gene regulation and disease development. Computational approaches that quickly predict likely RBP binding sites on circRNAs from sequence or structural binding statistics have attracted much attention. Deep learning is a popular choice in this area, but it usually requires a large amount of labeled training data and therefore performs unsatisfactorily for less characterized RBPs with only a limited number of known target circRNAs. Improving prediction performance for such RBPs with few labeled samples is a challenging task for deep learning-based models. In this study, we propose iDeepC, an RBP-specific method for predicting RBP binding sites on circRNAs from sequences. It adopts a Siamese neural network consisting of a lightweight attention module and a metric module. We found that the Siamese neural network, through pairwise metric learning, effectively enhances the network's ability to capture mutual information between circRNAs. To further address the small-sample-size problem, we pretrain the model on available labeled data from other RBPs and demonstrate the efficacy of this transfer-learning pipeline. We comprehensively evaluated iDeepC on benchmark datasets of RBP-binding circRNAs, and the results suggest that iDeepC achieves promising performance on poorly characterized RBPs. The source code is available at https://github.com/hehew321/iDeepC. |
Database: | OpenAIRE |
External link: |
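
The description mentions a Siamese network built from an attention module and a metric module. The following is a minimal forward-pass sketch of that general idea, not the authors' implementation (which is in the linked repository). Everything here is an assumption made for illustration: the class name `SiameseEncoder`, the helper names, the window width of 3, the attention pooling, the L1-based similarity, and the random untrained weights.

```python
import math
import random

ALPHABET = "ACGU"  # assumed RNA alphabet; any other symbol maps to all zeros


def one_hot(seq):
    """Encode a sequence as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == ref else 0.0 for ref in ALPHABET] for base in seq]


def windows(seq, width=3):
    """Flatten each sliding window of `width` bases into one feature vector."""
    encoded = one_hot(seq)
    return [sum(encoded[i:i + width], []) for i in range(len(seq) - width + 1)]


class SiameseEncoder:
    """Hypothetical shared branch: windowed projection plus attention pooling."""

    def __init__(self, dim=8, width=3, seed=0):
        rng = random.Random(seed)
        self.width = width
        n_inputs = width * len(ALPHABET)
        # Random, untrained weights; a real model would learn these.
        self.proj = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(dim)]
        self.attn = [rng.uniform(-1, 1) for _ in range(dim)]

    def encode(self, seq):
        """Return (embedding, attention weights) for one sequence."""
        hidden = [
            [math.tanh(sum(w * x for w, x in zip(row, feats))) for row in self.proj]
            for feats in windows(seq, self.width)
        ]
        # One attention score per window, normalised with a stable softmax.
        scores = [sum(a * h for a, h in zip(self.attn, vec)) for vec in hidden]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention-weighted sum of the window representations.
        embedding = [
            sum(w * vec[i] for w, vec in zip(weights, hidden))
            for i in range(len(self.attn))
        ]
        return embedding, weights


def similarity(encoder, seq_a, seq_b):
    """Metric step: both inputs pass through the SAME encoder (shared weights)."""
    emb_a, _ = encoder.encode(seq_a)
    emb_b, _ = encoder.encode(seq_b)
    distance = sum(abs(a - b) for a, b in zip(emb_a, emb_b))
    return math.exp(-distance)  # 1.0 for identical embeddings, smaller otherwise


encoder = SiameseEncoder()
score = similarity(encoder, "ACGUACGUAC", "GGGCCCAAAU")
```

In a pairwise metric-learning setup of this kind, the similarity score would be trained so that pairs from the same class score high and mixed pairs score low. Pretraining on other RBPs, as the description outlines, would correspond to learning the encoder weights on those data before fine-tuning on the target RBP.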