Construction of a Virtual Opioid Bioprofile: A Data-Driven QSAR Modeling Study to Identify New Analgesic Opioids
Authors: | Morgan H. James, Daniel P. Russo, Hao Zhu, Heather L. Ciallella, Xuelian Jia, Linlin Zhao |
---|---|
Year of publication: | 2021 |
Subject: | Quantitative structure–activity relationship; Computer science; General Chemical Engineering; Big data; Engineering and technology; General Chemistry; Machine learning; Natural sciences; Data-driven; Environmental Chemistry; Computer Aided Design; Renewable Energy, Sustainability and the Environment; Nanoscience & nanotechnology; Chemical sciences; Data set; Workflow; Artificial intelligence; DrugBank; PubChem |
Source: | ACS Sustain Chem Eng |
ISSN: | 2168-0485 |
DOI: | 10.1021/acssuschemeng.0c09139 |
Description: | Compared to traditional experimental approaches, computational modeling is a promising strategy for efficiently prioritizing new candidates at low cost. In this study, we developed a novel data mining and computational modeling workflow and demonstrated its applicability by screening for new analgesic opioids. To this end, a large opioid data set was used as the probe to automatically obtain bioassay data from the PubChem portal. In total, 114 PubChem bioassays were selected to build quantitative structure–activity relationship (QSAR) models based on the testing results across the probe compounds. For each bioassay, the tested compounds were used to develop 12 models, one for each combination of three machine learning approaches and four types of chemical descriptors. Model performance was evaluated by the coefficient of determination (R²) obtained from 5-fold cross-validation. In total, 49 models developed for 14 bioassays were selected based on the selection criteria and were found to be mainly associated with binding affinities to different opioid receptors. The models for these 14 bioassays were further used to fill data gaps in the probe opioid data set and to make predictions for general drug compounds in the DrugBank data set. This study provides a universal modeling strategy that can take advantage of large public data sets for computer-aided drug design (CADD). |
Database: | OpenAIRE |
External link: |
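
The modeling setup summarized in the description (12 models per bioassay from three machine learning approaches and four descriptor types, each scored by R² from 5-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation. The record does not name the algorithms or descriptors, so `ALGORITHMS` and `DESCRIPTORS` hold placeholder labels, `fit_line` is a single-feature least-squares stand-in for a real learner, and scoring the pooled out-of-fold predictions (rather than averaging per-fold scores) is an assumption.

```python
from itertools import product
import random

# Placeholder labels: the record does not name the three learners or the
# four descriptor types, so these identifiers are purely illustrative.
ALGORITHMS = ["algo_A", "algo_B", "algo_C"]
DESCRIPTORS = ["desc_1", "desc_2", "desc_3", "desc_4"]


def model_grid():
    """Enumerate every (algorithm, descriptor) pair: 3 x 4 = 12 per bioassay."""
    return list(product(ALGORITHMS, DESCRIPTORS))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Stand-in learner (assumption): least squares on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Collect out-of-fold predictions from k folds, then score them with R^2."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in held_out:
            preds[i] = slope * xs[i] + intercept
    return r_squared(ys, preds)


if __name__ == "__main__":
    # Synthetic toy data, not taken from the study.
    rng = random.Random(1)
    xs = [float(i) for i in range(40)]
    ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    for algo, desc in model_grid():
        # A real workflow would compute `desc` descriptors and train `algo`;
        # here each pair reuses the same stand-in model for illustration.
        print(algo, desc, round(cross_validated_r2(xs, ys), 3))
```

In a real workflow each of the 12 pairs would use its own descriptor matrix and learner, and only models passing the study's selection criteria (not given in this record) would be kept.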