In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning
Author: | Qingda Zang, Antony J. Williams, Warren Casey, Nicole Kleinstreuer, Kamel Mansouri, Richard S. Judson, David G. Allen |
---|---|
Year of publication: | 2017 |
Subject: |
0301 basic medicine; Quantitative structure–activity relationship; Informatics; Chemical Phenomena; Vapor Pressure; Computer science; General Chemical Engineering; In silico; Library and Information Sciences; Machine learning; Article; 03 medical and health sciences; Transition Temperature; Computer Simulation; Biological data; Toxicity data; Water; General Chemistry; Computer Science Applications; 030104 developmental biology; Workflow; Solubility; Cheminformatics; Environmental Pollutants; Artificial intelligence; Potential toxicity |
Source: | Journal of Chemical Information and Modeling. 57:36-49 |
ISSN: | 1549-960X 1549-9596 |
Description: | Few toxicity data are available for the vast majority of chemicals in commerce. High-throughput screening (HTS) studies, such as those being carried out by the U.S. Environmental Protection Agency (EPA) ToxCast program in partnership with the federal Tox21 research program, can generate biological data to inform models for predicting potential toxicity. However, physicochemical properties are also needed to model environmental fate and transport, as well as exposure potential. The purpose of the present study was to generate an open-source Quantitative Structure-Property Relationship (QSPR) workflow that predicts a variety of physicochemical properties and has the cross-platform compatibility needed to integrate into existing cheminformatics workflows. In this effort, decades-old experimental property data sets available within EPA EPI Suite™ were reanalyzed using modern cheminformatics workflows to build updated QSPR models capable of supplying computationally efficient, open, and transparent HTS property predictions in support of environmental modeling efforts. Models were built using updated EPI Suite™ data sets for the prediction of six physicochemical properties: octanol-water partition coefficient (log P), water solubility (log S), boiling point (BP), melting point (MP), vapor pressure (log VP), and bioconcentration factor (log BCF). The coefficient of determination (R²) between the estimated values and experimental data for the six predicted properties ranged from 0.826 (MP) to 0.965 (BP), with model performance for five of the six properties exceeding that of the original EPI Suite™ models. The newly derived models can be employed for rapid estimation of physicochemical properties within an open-source HTS workflow to inform fate and toxicity prediction models of environmental chemicals. |
Database: | OpenAIRE |
External link: |
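
The description reports model quality as the coefficient of determination (R²) between estimated and experimental values. The record does not state which formula the authors used, so the sketch below assumes the common definition R² = 1 − SS_res / SS_tot. The function name `r_squared` and the sample log P values are hypothetical and are not taken from the paper.

```python
def r_squared(experimental, estimated):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    if len(experimental) != len(estimated) or len(experimental) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_exp = sum(experimental) / len(experimental)
    # Residual sum of squares: squared gaps between measurement and estimate.
    ss_res = sum((y - f) ** 2 for y, f in zip(experimental, estimated))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((y - mean_exp) ** 2 for y in experimental)
    if ss_tot == 0:
        raise ValueError("experimental values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical log P values, for illustration only (not data from the paper).
experimental_log_p = [1.0, 2.0, 3.0, 4.0]
estimated_log_p = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(experimental_log_p, estimated_log_p), 3))
```

Under this definition a value of 1.0 means the estimates reproduce the measurements exactly, so the reported range of 0.826 (MP) to 0.965 (BP) would indicate that melting point is the hardest of the six properties to predict.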