Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction

Author: Kim, QHwan; Ko, Joon-Hyuk; Kim, Sunghoon; Park, Nojun; Jhe, Wonho
Year of publication: 2020
Subject:
Document type: Working Paper
Description: Characterizing drug-protein interactions (DPIs) is crucial in high-throughput screening for drug discovery. Deep learning-based approaches have attracted attention because they can predict drug-protein interactions without human trial and error. However, because data labeling requires significant resources, the available protein dataset is relatively small, which degrades model performance. Here we propose two methods for constructing a deep learning framework that achieves superior performance with a small labeled dataset. First, we use transfer learning to encode protein sequences with a pretrained model that learns general sequence representations in an unsupervised manner. Second, we use a Bayesian neural network to make the model robust by estimating the data uncertainty. As a result, our model outperforms previous baselines in predicting drug-protein interactions. We also show that the uncertainty quantified by Bayesian inference is related to prediction confidence and can be used to screen DPI data points.
Database: arXiv