Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis

Authors:	Aiping Lu, Tingjun Hou, Lu Liu, Li Fu, Pan Li, Junjie Ding, Yong-Huan Yun, Dong-Sheng Cao, Zhi-Jiang Yang
Publication year:	2019
Subject:	Quantitative structure–activity relationship; Chemical physics; Computer science; Generalization; General Chemical Engineering; Pattern recognition; Feature selection; General Chemistry; Library and Information Sciences; Natural sciences; Ensemble learning; Chemical sciences; Computer Science Applications; Medicinal & biomolecular chemistry; Robustness (computer science); Molecular descriptor; Physical sciences; Artificial intelligence; Matched molecular pair analysis; Applicability domain
Source:	Journal of Chemical Information and Modeling. 60:63-76
ISSN:	1549-960X 1549-9596
Description:	Lipophilicity, as measured by the n-octanol/buffer distribution coefficient at pH 7.4 (log D7.4), is a major determinant of various absorption, distribution, metabolism, elimination, and toxicology (ADMET) parameters of drug candidates. In this study, we developed several quantitative structure–property relationship (QSPR) models to predict log D7.4 from a large and structurally diverse data set. Eight popular machine learning algorithms were used to build the prediction models with 43 molecular descriptors selected by a wrapper feature selection method. XGBoost yielded better prediction performance than any other single model (R_T^2 = 0.906 and RMSE_T = 0.395). Moreover, a consensus model built from the top three models further improved the prediction performance (R_T^2 = 0.922 and RMSE_T = 0.359). The robustness, reliability, and generalization ability of the models were rigorously evaluated by the Y-randomization test and applicability domain analysis. Mor...
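The consensus step and the two reported metrics can be illustrated with a minimal sketch. This record does not say how the three models were combined, so the unweighted mean used here is an assumption. So is reading R_T^2 as the coefficient of determination and RMSE_T as the root-mean-square error on a held-out set. All function names and numeric values below are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean


def consensus(predictions):
    """Unweighted mean of per-compound predictions from several models.

    `predictions` is a list of equally long lists, one per model.
    Assumption: the paper's consensus scheme may differ from a plain mean.
    """
    return [mean(values) for values in zip(*predictions)]


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up log D7.4 values for four compounds (illustration only).
y_test = [1.0, 2.0, 3.0, 4.0]
model_a = [1.3, 2.2, 2.7, 4.1]  # hypothetical predictions of model A
model_b = [0.8, 1.9, 3.2, 3.8]  # hypothetical predictions of model B
model_c = [1.1, 1.6, 3.1, 4.4]  # hypothetical predictions of model C

y_consensus = consensus([model_a, model_b, model_c])
print("consensus R^2:", r_squared(y_test, y_consensus))
print("consensus RMSE:", rmse(y_test, y_consensus))
```

Averaging tends to cancel uncorrelated errors of the individual models, which is one plausible reason a consensus can score better than its best member.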
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::8450493ed77a37fd73def6b00bac7e0d https://doi.org/10.1021/acs.jcim.9b00718