Conformal prediction to define applicability domain – A case study on predicting ER and AR binding
Author: | Patrik L. Andersson, Aleksandra Rybacka, Ulf Norinder |
---|---|
Year of publication: | 2016 |
Subject: |
0301 basic medicine; Quantitative structure–activity relationship; Computer science; In silico; Molecular Conformation; Quantitative Structure-Activity Relationship; Bioengineering; Conformal map; Endocrine Disruptors; computer.software_genre; 01 natural sciences; 03 medical and health sciences; Statistical quality; Drug Discovery; Applied mathematics; Computer Simulation; Oestrogen receptor; Estrogens; General Medicine; 0104 chemical sciences; Random forest; 010404 medicinal & biomolecular chemistry; 030104 developmental biology; Receptors, Estrogen; Receptors, Androgen; Androgens; Molecular Medicine; Data mining; AR binding; computer; Protein Binding; Applicability domain |
Source: | SAR and QSAR in Environmental Research. 27:303-316 |
ISSN: | 1029-046X 1062-936X |
DOI: | 10.1080/1062936x.2016.1172665 |
Description: | A fundamental element in deriving a robust and predictive in silico model is not only the statistical quality of the model but, equally important, the estimate of its predictive boundaries. This work presents a new method, conformal prediction, for applicability domain estimation in the field of endocrine disruptors. The method is applied to binders and non-binders of the oestrogen and androgen receptors. Ensembles of decision trees are used as the statistical method, and three descriptor sets (Dragon, RDKit and signature fingerprints) are investigated as chemical descriptors. The conformal prediction method yields valid models with an excellent balance in quality between the internally validated training set and the corresponding external test set, both in terms of validity and with respect to sensitivity and specificity. With this method, the user can readily alter the level of confidence and immediately inspect the consequences. Furthermore, the conformal prediction framework rigorously defines the predictive boundaries of the derived models, so no ambiguity exists as to the level of similarity new compounds need in order to fall within the boundaries where reliable predictions can be expected. |
Database: | OpenAIRE |
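The description above outlines conformal prediction on top of an ensemble of decision trees, with a confidence level the user can change. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an inductive, class-conditional (Mondrian) setup with a scikit-learn random forest, which the abstract does not confirm, and it uses synthetic data in place of the Dragon, RDKit or signature descriptors. All function names (`fit_and_calibrate`, `conformal_p_values`, `prediction_sets`, `validity`) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def fit_and_calibrate(X_train, y_train, X_cal, y_cal, seed=0):
    """Fit a random forest and store sorted calibration scores per class."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_cal)
    # Nonconformity score (an assumption): 1 - probability of the true class.
    cols = np.searchsorted(model.classes_, y_cal)
    scores = 1.0 - proba[np.arange(len(y_cal)), cols]
    cal_scores = {c: np.sort(scores[y_cal == c]) for c in model.classes_}
    return model, cal_scores


def conformal_p_values(model, cal_scores, X_new):
    """Return one p-value per compound and per class (binder / non-binder)."""
    proba = model.predict_proba(X_new)
    pvals = np.empty_like(proba)
    for j, c in enumerate(model.classes_):
        s = cal_scores[c]
        new_scores = 1.0 - proba[:, j]
        # Number of calibration scores of class c that are >= the new score.
        n_ge = len(s) - np.searchsorted(s, new_scores, side="left")
        pvals[:, j] = (n_ge + 1) / (len(s) + 1)
    return pvals


def prediction_sets(pvals, significance):
    """A class is kept in the prediction set if its p-value exceeds significance."""
    return pvals > significance


def validity(sets, y_true, classes):
    """Fraction of compounds whose true class is inside the prediction set."""
    cols = np.searchsorted(classes, y_true)
    return sets[np.arange(len(y_true)), cols].mean()


def demo():
    # Synthetic stand-in for a descriptor matrix and binder labels.
    X, y = make_classification(n_samples=1500, n_features=20,
                               n_informative=8, random_state=1)
    X_tr, X_rest, y_tr, y_rest = train_test_split(
        X, y, test_size=0.5, random_state=1, stratify=y)
    X_cal, X_te, y_cal, y_te = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=1, stratify=y_rest)
    model, cal_scores = fit_and_calibrate(X_tr, y_tr, X_cal, y_cal)
    pvals = conformal_p_values(model, cal_scores, X_te)
    for significance in (0.05, 0.10, 0.20):
        sets = prediction_sets(pvals, significance)
        single = (sets.sum(axis=1) == 1).mean()
        print(f"confidence={1 - significance:.2f} "
              f"validity={validity(sets, y_te, model.classes_):.3f} "
              f"single-label fraction={single:.3f}")


if __name__ == "__main__":
    demo()
```

Changing `significance` is the sketch's counterpart to altering the confidence level: a lower value gives larger prediction sets and higher validity. One common reading, offered here as an assumption, is that compounds receiving an empty set or both labels are those for which the model cannot make a reliable single-class call.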