Estimation of pKa for Druglike Compounds Using Semiempirical and Information-Based Descriptors
Author: | Paul Selzer, Peter Ertl, Stephen Jelfs |
---|---|
Year of publication: | 2007 |
Subject: |
Pyridines; General Chemical Engineering; Carboxylic Acids; Library and Information Sciences; Computational chemistry; Ionization; Physics::Atomic and Molecular Clusters; Physics::Atomic Physics; Amines; Physics::Chemical Physics; Ions; Internet; Aniline Compounds; Aqueous solution; Molecular Structure; Chemistry; Computational Biology; General Medicine; General Chemistry; Hydrogen-Ion Concentration; Computer Science Applications; Pyrimidines; Models, Chemical; Pharmaceutical Preparations; Alcohols; Imines |
Source: | Journal of Chemical Information and Modeling. 47:450-459 |
ISSN: | 1549-960X 1549-9596 |
Description: | A pragmatic approach has been developed for the estimation of aqueous ionization constants (pKa) for druglike compounds. The method involves an algorithm that assigns ionization constants in a stepwise manner to the acidic and basic groups present in a compound. Predictions are made for each ionizable group using models derived from semiempirical quantum chemical properties and information-based descriptors. Semiempirical properties include the partial charge and electrophilic superdelocalizability of the atom(s) undergoing protonation or deprotonation. Importantly, the latter property has been extended to allow predictions to be made for multiprotic compounds, overcoming limitations of a previous approach described by Tehan et al. The information-based descriptors include molecular-tree structured fingerprints, based on the methodology outlined by Xing et al., with the addition of 2D substructure flags indicating the presence of other important structural features. These two classes of descriptor were found to complement one another particularly well, resulting in predictive models for a range of functional groups (including alcohols, amidines, amines, anilines, carboxylic acids, guanidines, imidazoles, imines, phenols, pyridines, and pyrimidines). Combined RMSEs of 0.48 and 0.81 were obtained for the training set and an external test set of compounds, respectively. The predictive models were based on compounds selected from the commercially available BioLoom database. The resultant speed and accuracy of the approach have also enabled the development of a Web application on the Novartis intranet for pKa prediction. |
Database: | OpenAIRE |
External link: |
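
The description reports model quality as a root-mean-square error (RMSE) between predicted and experimental pKa values. As a minimal sketch of how such a figure is computed, the Python snippet below implements the standard RMSE formula. The function name `rmse` and the sample values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences of pKa values."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared residuals, then square root.
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical pKa values for illustration only (not taken from the paper).
experimental_pka = [4.76, 9.25, 5.23, 10.00]
predicted_pka = [4.50, 9.80, 5.00, 10.60]

print(f"RMSE = {rmse(predicted_pka, experimental_pka):.2f} pKa units")
```

Because RMSE is expressed in the same units as the quantity being predicted, the reported values of 0.48 (training set) and 0.81 (external test set) can be read directly as typical errors in pKa units.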