iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy.

Autor: Ahmed Z; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China., Zulfiqar H; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China., Khan AA; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China.; Sichuan Artificial Intelligence Research Institute, Yibin, China., Gul I; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.; Tsinghua Shenzhen International Graduate School, Institute of Biopharmaceutical and Health Engineering, Tsinghua University, Shenzhen, China., Dao FY; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China., Zhang ZY; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China., Yu XL; School of Materials Science and Engineering, Hainan University, Haikou, China., Tang L; School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.
Jazyk: angličtina
Zdroj: Frontiers in microbiology [Front Microbiol] 2022 Feb 22; Vol. 13, pp. 790063. Date of Electronic Publication: 2022 Feb 22 (Print Publication: 2022).
DOI: 10.3389/fmicb.2022.790063
Abstrakt: Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. The identification method of thermophilic proteins based on biochemistry is laborious, time-consuming, and high cost. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. Considering this urgency, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On independent data set, the proposed model could achieve an accuracy of 96.26%, which demonstrates that the model has a good application prospect. In order to use the model conveniently, a user-friendly software package called iThermo was established and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicability of the developed software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins.
Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
(Copyright © 2022 Ahmed, Zulfiqar, Khan, Gul, Dao, Zhang, Yu and Tang.)
Databáze: MEDLINE