Prediction of Protein pKa with Representation Learning

Authors:	Hatice Gökcan, Olexandr Isayev
Year of publication:	2022
Subject:	Molecular interactions; Artificial neural network; Null model; Test set; A protein; Biological system; Representation (mathematics); Feature learning; Quantum; Mathematics
DOI:	10.33774/chemrxiv-2021-tcn0f-v2
Description:	Protein behavior is closely tied to the protonation states of its residues, so predicting and measuring pKa is essential for understanding basic protein function. In this work, we develop a new empirical scheme for protein pKa prediction based on deep representation learning. It combines machine learning with atomic environment vectors (AEVs) and a learned quantum mechanical representation from the ANI-2x neural network potential (J. Chem. Theory Comput. 2020, 16, 4192). The scheme requires only the coordinates of a protein as input and estimates the pKa separately for each of the five titratable amino acid types. Accuracy was assessed with both cross-validation and an external test set of proteins, and the results were compared with the widely used empirical approach PROPKA. The new model achieves mean absolute errors (MAEs) below 0.5 pKa units for all amino acid types. It surpasses the accuracy of PROPKA and performs significantly better than the null model. The model is also sensitive to local conformational changes and molecular interactions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7c9799eda74469eeca9ea5580dbfaaca https://doi.org/10.33774/chemrxiv-2021-tcn0f-v2
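
The abstract reports accuracy as a mean absolute error (MAE) and compares it with a null model. The sketch below illustrates how such a comparison might be computed. It assumes the null model assigns every residue a fixed reference pKa for its amino acid type, which is a common convention but is not defined in the abstract. The residue names, reference values, and pKa numbers are hypothetical toy inputs, not data from the paper, and the function names are illustrative only.

```python
from statistics import mean


def mean_absolute_error(predicted, experimental):
    """Return the mean of |predicted - experimental| over paired values."""
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have equal length")
    return mean(abs(p - e) for p, e in zip(predicted, experimental))


def null_model_predictions(residue_types, reference_pka):
    """Assumed null model: every residue gets the reference pKa of its type."""
    return [reference_pka[res] for res in residue_types]


# Hypothetical toy inputs (NOT taken from the paper).
residue_types = ["ASP", "ASP", "GLU", "HIS"]
experimental = [3.5, 4.5, 4.0, 7.0]
model_predicted = [3.75, 4.25, 4.25, 6.75]

# Placeholder reference values chosen for illustration only.
reference_pka = {"ASP": 4.0, "GLU": 4.5, "HIS": 6.5}

null_predicted = null_model_predictions(residue_types, reference_pka)
mae_model = mean_absolute_error(model_predicted, experimental)
mae_null = mean_absolute_error(null_predicted, experimental)

print(f"model MAE: {mae_model:.2f}")
print(f"null  MAE: {mae_null:.2f}")
```

In practice the MAE would be computed separately for each amino acid type, since the abstract states that the scheme estimates pKa separately per type.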