Prediction of chemical biodegradability using computational methods

Authors: Zhixiong Zhan, Youyong Li, Xuechu Zhen, Sheng Tian, Linlang Li
Year of publication: 2017
Subject:
DOI: 10.6084/m9.figshare.5100445.v1
Description: Biodegradability is a key factor in describing the long-term effects of chemicals that decompose in the environment. Because experimental testing is time-consuming and laborious, legislators strongly encourage the use of in silico approaches for assessing chemical biodegradability. In this study, based on an extensive data set of 547 ready biodegradation (RB) and 1178 non-ready biodegradation (NRB) chemicals, we first examined the differences in important physico-chemical properties and scaffold architectures between the RB and NRB molecules. We found that, compared with the NRB molecules, the RB molecules are usually smaller, more flexible and more hydrophilic, and have fewer polar groups and more complicated structural patterns (ring systems). However, the RB and NRB molecules cannot be well distinguished by any simple property-based or substructure-based rules. The naive Bayesian classification (NBC) approach was then employed to develop classifiers for discriminating between the RB and NRB molecules. Based on the 21 physico-chemical properties, 76 VolSurf descriptors and LPFP_4 structural fingerprints, the Bayesian classifier achieves a sensitivity of 0.877, a specificity of 0.864, a global accuracy of 0.869, a C value of 0.720 and an AUC value of 0.890 for the training set. The best predictions are achieved by the classifiers based on the combination of simple physico-chemical properties, VolSurf descriptors and LPFP_6 fingerprints for test set I (AUC = 0.921), and on any of the three fingerprint classes (ECFC_6, ECFC_8 or LPFC_4) for test set II (AUC = 0.901). In addition, 20 structural fragments favourable and unfavourable for ready biodegradation, generated directly from the best naive Bayesian classifier, are highlighted and discussed. The results provide useful guidelines and tools for designing promising chemical compounds with good biodegradability.
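The performance measures quoted in the description (sensitivity, specificity, global accuracy and AUC) can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes RB is treated as the positive class, the names `ConfusionCounts` and `auc` are hypothetical, and the counts and scores in the usage lines are toy values, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary classifier (assumption: RB is the positive class)."""
    tp: int  # RB molecules predicted as RB
    fn: int  # RB molecules predicted as NRB
    tn: int  # NRB molecules predicted as NRB
    fp: int  # NRB molecules predicted as RB

    def sensitivity(self) -> float:
        # Fraction of true positives that were recovered.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of true negatives that were recovered.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Fraction of all molecules classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative; ties count as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy values for illustration only (not from the study).
toy = ConfusionCounts(tp=8, fn=2, tn=15, fp=5)
toy_auc = auc([0.9, 0.6], [0.7, 0.1])
```

The pairwise AUC is quadratic in the number of molecules, which is acceptable for a sketch; a rank-based formulation would be preferable for larger data sets.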
Database: OpenAIRE