Author: |
Kim, QHwan; Ko, Joon-Hyuk; Kim, Sunghoon; Park, Nojun; Jhe, Wonho |
Year of publication: |
2020 |
Subject: |
|
Document type: |
Working Paper |
Description: |
The characterization of drug-protein interactions (DPIs) is crucial in high-throughput screening for drug discovery. Deep learning-based approaches have attracted attention because they can predict drug-protein interactions without human trial and error. However, because data labeling requires significant resources, the available protein dataset is relatively small, which degrades model performance. Here we propose two methods to construct a deep learning framework that achieves superior performance with a small labeled dataset. First, we use transfer learning to encode protein sequences with a pretrained model that learns general sequence representations in an unsupervised manner. Second, we use a Bayesian neural network to make the model robust by estimating the data uncertainty. As a result, our model outperforms previous baselines in predicting drug-protein interactions. We also show that the uncertainty quantified by Bayesian inference is related to the prediction confidence and can be used for screening DPI data points. |
Database: |
arXiv |
External link: |
|