A general model for predicting enzyme functions based on enzymatic reactions

Autor: Wenjia Qian, Xiaorui Wang, Yu Kang, Peichen Pan, Tingjun Hou, Chang-Yu Hsieh
Jazyk: angličtina
Rok vydání: 2024
Předmět:
Zdroj: Journal of Cheminformatics, Vol 16, Iss 1, Pp 1-16 (2024)
Druh dokumentu: article
ISSN: 1758-2946
DOI: 10.1186/s13321-024-00827-y
Popis: Abstract Accurate prediction of the enzyme comission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine leanring (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and long-winded experimental verifications. However, the prediction accuracy for most available models trained on the records of chemical reactions without specifying the enzymatic catalysts is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of Enzyme Commission (EC) numbers solely through analysis of the SMILES sequences of substrates and products. BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate the cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples on how BEC-Pred could accurately predict the enzymatic classification for the Novozym 435-induced hydrolysis and lipase efficient catalytic synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.
Databáze: Directory of Open Access Journals
Nepřihlášeným uživatelům se plný text nezobrazuje