Machine learning with physicochemical relationships: solubility prediction in organic solvents and water

Authors: Bao N. Nguyen, David R. J. Hose, A. John Blacker, Samuel Boobier
Language: English
Year of publication: 2020
Subject:
Source: Nature Communications, Vol 11, Iss 1, Pp 1-10 (2020)
ISSN: 2041-1723
Description: Solubility prediction remains a critical challenge in drug development, synthetic route and chemical process design, extraction and crystallisation. Here we report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry. Rational interpretation of the dissolution process as a numerical problem led to a small set of selected descriptors and to subsequent predictions that are independent of the applied machine learning method. These models gave significantly more accurate predictions than benchmarked open-access and commercial tools, achieving accuracy close to the expected level of noise in the training data (LogS ± 0.7). Finally, they reproduced the physicochemical relationships between solubility and molecular properties in different solvents, which led to rational approaches to improving the accuracy of each model.
Accurate prediction of solubility represents a challenge for traditional computational approaches due to the complex nature of the phenomena involved. Here the authors report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning and computational chemistry.
Database: OpenAIRE