Prediction of Protein–ATP Binding Residues Based on Ensemble of Deep Convolutional Neural Networks and LightGBM Algorithm

Author:	Jiazhi Song, Guixia Liu, Yanchun Liang, Jingqing Jiang, Ping Zhang
Language:	English
Publication year:	2021
Subject:	0301 basic medicine; Support Vector Machine; Computer science; protein–ATP binding residue prediction; 02 engineering and technology; Convolutional neural network; Catalysis; Article; LightGBM; Inorganic Chemistry; lcsh:Chemistry; Machine Learning; 03 medical and health sciences; Adenosine Triphosphate; deep convolutional neural network; Prediction methods; 0202 electrical engineering, electronic engineering, information engineering; Humans; Amino Acid Sequence; Physical and Theoretical Chemistry; Molecular Biology; lcsh:QH301-705.5; Spectroscopy; Organic Chemistry; Computational Biology; Proteins; General Medicine; Matthews correlation coefficient; Ensemble learning; Computer Science Applications; Random forest; Support vector machine; 030104 developmental biology; lcsh:Biology (General); lcsh:QD1-999; Weight distribution; Benchmark (computing); ensemble learning; 020201 artificial intelligence & image processing; Neural Networks, Computer; protein primary sequence; Carrier Proteins; Algorithm; Algorithms; Protein Binding
Source:	International Journal of Molecular Sciences, Volume 22, Issue 2; International Journal of Molecular Sciences, Vol 22, Iss 939, p 939 (2021)
ISSN:	1422-0067
DOI:	10.3390/ijms22020939
Description:	Accurately identifying protein–ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms such as support vector machine (SVM) and random forest to predict protein–ATP binding residues; however, as new machine-learning techniques are developed, prediction performance can be further improved. In this paper, an ensemble predictor that combines deep convolutional neural networks and LightGBM through an ensemble learning algorithm is proposed. Three subclassifiers have been developed: a multi-incepResNet-based predictor, a multi-Xception-based predictor, and a LightGBM predictor. The final prediction is the combination of the outputs of the three subclassifiers with an optimized weight distribution. We examined the performance of the proposed predictor on two datasets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. The predictor achieved area under the curve (AUC) values of 0.925 and 0.902 and Matthews correlation coefficient (MCC) values of 0.639 and 0.642, respectively, both better than those of other state-of-the-art prediction methods.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9fe5966be2b1229c67c7a2888adb5975
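The abstract in this record describes combining the outputs of three subclassifiers with an optimized weight distribution and scoring the result with MCC. The following is a minimal Python sketch of that general idea, not the authors' implementation: the function names (`combine`, `mcc`, `search_weights`), the coarse grid search, the 0.5 decision threshold, and the toy numbers are all assumptions made for illustration. The record does not say how the weights were actually optimized.

```python
from math import sqrt


def combine(probs, weights):
    """Weighted average of per-residue probabilities from several subclassifiers."""
    n_residues = len(probs[0])
    return [sum(w * p[i] for w, p in zip(weights, probs)) for i in range(n_residues)]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def search_weights(probs, y_true, steps=10, threshold=0.5):
    """Grid search over three non-negative weights that sum to 1.

    Assumption: a hypothetical stand-in for whatever weight optimization
    the paper uses. Returns (best_mcc, best_weights).
    """
    best_score, best_weights = -2.0, None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            weights = (a / steps, b / steps, (steps - a - b) / steps)
            scores = combine(probs, weights)
            pred = [1 if s >= threshold else 0 for s in scores]
            score = mcc(y_true, pred)
            if score > best_score:
                best_score, best_weights = score, weights
    return best_score, best_weights


if __name__ == "__main__":
    # Toy validation data (made up): 1 = ATP-binding residue, 0 = non-binding.
    labels = [1, 0, 1, 0]
    subclassifier_probs = [
        [0.9, 0.1, 0.8, 0.2],  # stand-in for the multi-incepResNet predictor
        [0.4, 0.6, 0.7, 0.3],  # stand-in for the multi-Xception predictor
        [0.6, 0.4, 0.3, 0.7],  # stand-in for the LightGBM predictor
    ]
    print(search_weights(subclassifier_probs, labels))
```

In practice the weights would be chosen on held-out validation data rather than on the test set, and a finer search or a different objective (for example AUC) could be substituted.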