Dielectric Polymer Property Prediction Using Recurrent Neural Networks with Optimizations.

Authors: Nazarova AL; Department of Chemistry, Loker Hydrocarbon Research Institute, and USC Bridge Institute, University of Southern California, Los Angeles, California 90089, United States., Yang L; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Liu K; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Mishra A; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Kalia RK; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Nomura KI; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Nakano A; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Vashishta P; Collaboratory of Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, and Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, United States., Rajak P; Argonne Leadership Computing Facility, Argonne National Laboratory, Lemont, Illinois 60439, United States.
Language: English
Source: Journal of chemical information and modeling [J Chem Inf Model] 2021 May 24; Vol. 61 (5), pp. 2175-2186. Date of Electronic Publication: 2021 Apr 19.
DOI: 10.1021/acs.jcim.0c01366
Abstract: Machine learning has had growing success in predicting structure-property relationships in molecules and materials, such as the dielectric properties of polymers, but the field is still in its infancy. We report on the effectiveness of recurrent neural network (RNN) models in solving structure-property relationships for a computer-generated database of dielectric polymers. A series of optimization strategies was crucial to achieving high learning speeds and sufficient accuracy: (1) binary and nonbinary representations of SMILES (Simplified Molecular Input Line Entry System) fingerprints and (2) backpropagation with affine transformation of the input sequence (ATransformedBP) and resilient backpropagation with initial weight update parameter optimizations (iRPROP - optimized). For the investigated database of polymers, the binary SMILES representation was found to be superior to the decimal representation with respect to training and prediction performance. All developed and optimized Elman-type RNN algorithms outperformed nonoptimized RNN models in the efficient prediction of nonlinear structure-activity relationships. The average relative standard deviation (RSD) remained well below 5%, and the maximum RSD did not exceed 30%. Moreover, we provide a C++ codebase as a testbed for a new generation of open programming languages that target increasingly diverse computer architectures.
Database: MEDLINE
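
Note on the RSD figures quoted in the abstract: the relative standard deviation is the standard deviation expressed as a percentage of the absolute mean. The abstract does not state which values the RSD was computed over, so the minimal Python sketch below assumes, purely for illustration, that each polymer has several repeated predictions and that the average and maximum are taken over polymers. The function names and the sample numbers are hypothetical and do not come from the paper or its C++ codebase.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD of `values` in percent: 100 * sample std / |mean|."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


def summarize_rsd(predictions_per_polymer):
    """Return (average RSD, maximum RSD) in percent over all polymers.

    Assumption: each inner sequence holds repeated predictions of one
    property for one polymer. The abstract does not specify this layout.
    """
    rsds = [relative_standard_deviation(v) for v in predictions_per_polymer]
    return mean(rsds), max(rsds)


# Hypothetical illustrative numbers, not data from the paper.
example_predictions = [
    [3.0, 3.0, 3.0],  # identical predictions, so the RSD is zero
    [2.0, 4.0, 3.0],  # mean 3.0, sample standard deviation 1.0
]
average_rsd, maximum_rsd = summarize_rsd(example_predictions)
```

The sketch uses the sample standard deviation (n - 1 in the denominator); using the population form instead would give slightly smaller values for small samples.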