Author: |
Zihao Wang, Mario R. Eden, Saimeng Jin, Weifeng Shen, Yang Su, Huaqiang Wen, Jingzheng Ren |
Year of publication: |
2021 |
Subject: |
|
DOI: |
10.22541/au.162206662.29993062/v1 |
Description: |
Quantitative structure-property relationship (QSPR) studies based on deep neural networks (DNNs) are receiving increasing attention because of their excellent performance. A systematic methodology coupling multiple machine learning techniques is proposed to address key problems in DNN-based QSPRs, including the applicability domain and prediction uncertainty. Key features are rapidly extracted from plentiful but chaotic descriptors by principal component analysis (PCA) and kernel PCA. A detailed applicability domain (AD) is then defined with the K-means algorithm to avoid unreliable predictions and to examine its potential impact on uncertainty. Moreover, prediction uncertainty is analyzed with a dropout-embedded DNN through thousands of independent tests to assess the reliability of predictions. The prediction of flashpoint temperature is used as a case study, demonstrating that model accuracy is remarkably improved compared with the reference model. More importantly, the proposed methodology overcomes difficulties in analyzing the uncertainty of DNN-based QSPRs and presents an AD correlated with the uncertainty. |
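The two reliability checks named in the description (a cluster-based applicability domain and dropout-based uncertainty) can be illustrated with a minimal sketch. It is not the authors' implementation. Several things are assumptions made only for illustration: the AD rule (a query is inside the domain if it lies within a fixed `radius` of some K-means centroid), the tiny one-hidden-layer network, and all names, weights, and centroid values. The PCA / kernel PCA step is omitted, so the inputs stand in for already-reduced features.

```python
import math
import random
import statistics


def nearest_centroid_distance(x, centroids):
    """Euclidean distance from feature vector x to its closest centroid."""
    return min(math.dist(x, c) for c in centroids)


def in_applicability_domain(x, centroids, radius):
    """Assumed AD rule: x is inside the domain if it lies within
    `radius` of at least one cluster centroid."""
    return nearest_centroid_distance(x, centroids) <= radius


def dropout_forward(x, w_hidden, w_out, p_drop, rng):
    """One stochastic pass through a one-hidden-layer ReLU network,
    with inverted dropout applied to the hidden units."""
    hidden = []
    for weights in w_hidden:
        activation = max(0.0, sum(w * xi for w, xi in zip(weights, x)))
        keep = rng.random() >= p_drop
        # Kept units are rescaled so the expected activation is unchanged.
        hidden.append(activation / (1.0 - p_drop) if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_predict(x, w_hidden, w_out, p_drop=0.2, n_passes=1000, seed=0):
    """Repeat the stochastic pass n_passes times with dropout left on,
    and return the mean and sample standard deviation of the outputs."""
    rng = random.Random(seed)
    samples = [
        dropout_forward(x, w_hidden, w_out, p_drop, rng)
        for _ in range(n_passes)
    ]
    return statistics.fmean(samples), statistics.stdev(samples)


# Hypothetical values standing in for K-means centroids and trained weights.
CENTROIDS = [(0.0, 0.0), (5.0, 5.0)]
W_HIDDEN = [(0.5, -0.2), (0.1, 0.8), (-0.3, 0.4)]
W_OUT = (1.0, 0.5, -0.7)

if __name__ == "__main__":
    query = (0.4, 0.3)
    if in_applicability_domain(query, CENTROIDS, radius=1.0):
        mean, std = mc_dropout_predict(query, W_HIDDEN, W_OUT)
        print(f"prediction = {mean:.3f}, uncertainty (std) = {std:.3f}")
    else:
        print("query is outside the applicability domain; prediction withheld")
```

The spread of the repeated dropout passes serves as the uncertainty estimate, and the domain check runs first so that queries far from the training clusters are flagged rather than predicted. In practice the centroids would come from a clustering library and the network from a deep learning framework.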
Database: |
OpenAIRE |
External link: |
|