In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning

Authors: Qingda Zang, Antony J. Williams, Warren Casey, Nicole Kleinstreuer, Kamel Mansouri, Richard S. Judson, David G. Allen
Publication year: 2017
Subject:
Source: Journal of Chemical Information and Modeling. 57:36-49
ISSN: 1549-960X
1549-9596
Description: Few toxicity data are available for the vast majority of chemicals in commerce. High-throughput screening (HTS) studies, such as those being carried out by the U.S. Environmental Protection Agency (EPA) ToxCast program in partnership with the federal Tox21 research program, can generate biological data to inform models for predicting potential toxicity. However, physicochemical properties are also needed to model environmental fate and transport, as well as exposure potential. The purpose of the present study was to generate an open-source Quantitative Structure-Property Relationship (QSPR) workflow that predicts a variety of physicochemical properties and has the cross-platform compatibility needed for integration into existing cheminformatics workflows. In this effort, decades-old experimental property data sets available within EPA EPI Suite™ were reanalyzed using modern cheminformatics workflows to build updated QSPR models capable of supplying computationally efficient, open, and transparent HTS property predictions in support of environmental modeling efforts. Models were built using updated EPI Suite™ data sets for the prediction of six physicochemical properties: octanol-water partition coefficient (log P), water solubility (log S), boiling point (BP), melting point (MP), vapor pressure (log VP), and bioconcentration factor (log BCF). The coefficient of determination (R²) between the estimated values and experimental data for the six predicted properties ranged from 0.826 (MP) to 0.965 (BP), with model performance for five of the six properties exceeding that of the original EPI Suite™ models. The newly derived models can be employed for rapid estimation of physicochemical properties within an open-source HTS workflow to inform fate and toxicity prediction models of environmental chemicals.
Database: OpenAIRE