A combinatorial machine-learning-driven approach for predicting glass transition temperature based on numerous molecular descriptors.

Authors: Li, Dazi; Dong, Caibo; Chen, Zhudan; Dong, Yining; Liu, Jun
Subject:
Source: Molecular Simulation; Mar/Apr 2023, Vol. 49, Issue 6, pp. 617-627, 11 pp.
Abstract: Glass transition temperature (Tg) is one of the most significant thermophysical properties, and it is hard to measure experimentally. With the development of machine learning, many molecular representation methods and prediction algorithms have been proposed for predicting Tg. However, most of these algorithms treat molecular descriptors linearly, while the values of molecular descriptors obtained from experiments or simulation software may have nonlinear relationships. General nonlinear machine learning methods are difficult to apply because of the very high dimensionality of molecular descriptors. In this paper, molecular descriptors are defined in manifold space and manifold learning is used for dimension reduction. Unlike linear algorithms for selecting molecular descriptors, Locally Linear Embedding (LLE) is used to reduce dimensionality by extracting new features from numerous molecular descriptors. The reduced-dimension features are fed to the eXtreme Gradient Boosting (XGBoost) model. This ensemble algorithm is suitable for predicting glass transition temperature due to its high adaptability and nonlinear approximation ability. A genetic algorithm (GA) is adopted for optimising the parameters of XGBoost. The combinatorial approach thus consists of dimensionality reduction, prediction and optimisation. Experimental results show that the proposed machine learning approach, LLE-XGBoost-GA, has higher accuracy in Tg prediction of polymers. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
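
Note: The optimisation step named in the abstract (a genetic algorithm tuning XGBoost parameters) can be illustrated with a minimal sketch. This is not the authors' implementation. The search space (`learning_rate`, `max_depth`), the bounds, the GA settings and the function `toy_validation_error` are all assumptions made for illustration. In a real pipeline the toy function would presumably be replaced by a cross-validated prediction error of an XGBoost model trained on the LLE-reduced descriptors.

```python
import random

# Hypothetical search space for two XGBoost-style hyperparameters (assumed bounds).
BOUNDS = {"learning_rate": (0.01, 0.5), "max_depth": (2, 10)}


def toy_validation_error(params):
    """Placeholder fitness (lower is better); NOT real Tg data or a real model."""
    return (params["learning_rate"] - 0.1) ** 2 + 0.01 * (params["max_depth"] - 6) ** 2


def random_individual(rng):
    """Draw one candidate parameter set uniformly within BOUNDS."""
    return {
        "learning_rate": rng.uniform(*BOUNDS["learning_rate"]),
        "max_depth": rng.randint(*BOUNDS["max_depth"]),
    }


def crossover(a, b, rng):
    """Uniform crossover: each parameter is copied from one parent at random."""
    return {key: (a[key] if rng.random() < 0.5 else b[key]) for key in a}


def mutate(individual, rng, rate=0.2):
    """Perturb each parameter with probability `rate`, clipping to BOUNDS."""
    child = dict(individual)
    if rng.random() < rate:
        lo, hi = BOUNDS["learning_rate"]
        child["learning_rate"] = min(hi, max(lo, child["learning_rate"] + rng.gauss(0, 0.05)))
    if rng.random() < rate:
        lo, hi = BOUNDS["max_depth"]
        child["max_depth"] = min(hi, max(lo, child["max_depth"] + rng.choice([-1, 1])))
    return child


def tournament(population, fitness, rng, k=3):
    """Select the best of k randomly sampled individuals."""
    return min(rng.sample(population, k), key=fitness)


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Minimise `fitness` with elitism, tournament selection, crossover and mutation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_population = [best]  # elitism: the best individual always survives
        while len(next_population) < pop_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            next_population.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = next_population
        best = min(population, key=fitness)
    return best


if __name__ == "__main__":
    print(genetic_search(toy_validation_error))
```

Because the best individual is carried over unchanged each generation, the best fitness found never gets worse from one generation to the next; whether it reaches a good optimum depends on the (assumed) population size, mutation rate and number of generations.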