AFP-CMBPred: Computational identification of antifreeze proteins by extending consensus sequences into multi-blocks evolutionary information.
Authors: | Ali F; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China. Electronic address: farman335@yahoo.com., Akbar S; Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan., Ghulam A; Computerization and Network Section, Sindh Agriculture University, Tandojam, Pakistan., Maher ZA; Information Technology Center, Sindh Agriculture University, Tandojam, Pakistan., Unar A; School of Life Science, University of Science and Technology, China., Talpur DB; School of Information and Communication Engineering, Guilin University of Electronic Technology, Guilin, China. |
---|---|
Language: | English |
Source: | Computers in biology and medicine [Comput Biol Med] 2021 Dec; Vol. 139, pp. 105006. Date of Electronic Publication: 2021 Nov 02. |
DOI: | 10.1016/j.compbiomed.2021.105006 |
Abstract: | In extremely cold environments, living organisms such as plants, animals, fish, and microbes can die from intracellular ice formation. To survive in such environments, some cold-blooded species produce antifreeze proteins (AFPs), also called ice-binding proteins. AFPs are important not only in medicine but also in biotechnology, agriculture, and the food industry. AFPs are highly heterogeneous in structure and sequence. Given their significance, several machine-learning-based models have been developed for AFP prediction. However, because AFPs are complex and diverse, the prediction performance of existing methods is limited, and a reliable computational model that can accurately predict AFPs is still needed. This study presents a novel AFP predictor named AFP-CMBPred. AFP sequences are encoded with four feature representation methods, namely Amphiphilic Pseudo Amino Acid Composition (Amp-PseAAC), Dipeptide Deviation from Expected Mean (DDE), Multi-Blocks Position Specific Scoring Matrix (MB-PSSM), and Consensus Sequence-based on Multi-Blocks Position Specific Scoring Matrix (CS-MB-PSSM), to capture local and global descriptors. The extracted feature vectors are then evaluated with Support Vector Machine (SVM) and Random Forest (RF) classifiers. The performance of both classifiers is further assessed with three validation methods: the jackknife test, the 10-fold cross-validation test, and an independent test. Across these tests, the proposed model achieved prediction accuracies higher than those of existing models by ∼2.65%, ∼2.84%, and ∼3.37% on the jackknife, 10-fold cross-validation, and independent tests, respectively. These results indicate that the proposed AFP-CMBPred predictor outperforms existing models for the identification of AFPs. The authors anticipate that AFP-CMBPred will be a valuable tool for academic research and drug development. (Copyright © 2021 Elsevier Ltd. All rights reserved.) |
Database: | MEDLINE |
External link: |
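
Of the four feature encodings named in the abstract, Dipeptide Deviation from Expected Mean (DDE) has a compact closed form in the general literature, so it can be illustrated briefly. The sketch below follows the commonly cited definition (observed dipeptide frequency, standardized by a theoretical mean and variance derived from codon counts in the standard genetic code). It is an assumption that AFP-CMBPred uses exactly this formulation; the function name `dde_features` and the choice to skip non-standard residues are illustrative, not taken from the paper.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and the number of sense codons encoding each
# in the standard genetic code (the counts sum to 61).
CODONS = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}
TOTAL_CODONS = sum(CODONS.values())  # 61
DIPEPTIDES = ["".join(p) for p in product(sorted(CODONS), repeat=2)]


def dde_features(sequence):
    """Return the 400-dimensional DDE vector for one protein sequence.

    Hypothetical helper: dipeptides containing non-standard residues are
    ignored when counting, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    n_pairs = len(seq) - 1
    if n_pairs < 1:
        raise ValueError("sequence must contain at least two residues")

    # Observed dipeptide counts over all overlapping windows of length 2.
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1

    features = []
    for dp in DIPEPTIDES:
        dc = counts[dp] / n_pairs                       # observed composition
        tm = (CODONS[dp[0]] / TOTAL_CODONS) * (CODONS[dp[1]] / TOTAL_CODONS)
        tv = tm * (1.0 - tm) / n_pairs                  # theoretical variance
        features.append((dc - tm) / sqrt(tv))           # standardized deviation
    return features
```

A vector produced this way could then be passed, alone or concatenated with other descriptors, to any standard SVM or Random Forest implementation; the abstract does not specify how the features are combined.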