Data-Augmentation for Graph Neural Network Learning of the Relaxed Energies of Unrelaxed Structures

Authors: Gibson, Jason B.; Hire, Ajinkya C.; Hennig, Richard G.
Year of publication: 2022
Subject:
Document type: Working Paper
Description: Computational materials discovery has continually grown in utility over the past decade due to advances in computing power and crystal structure prediction algorithms (CSPA). However, the computational cost of the ab initio calculations required by CSPA limits their utility to small unit cells, reducing the compositional and structural space the algorithms can explore. Past studies have bypassed many unneeded ab initio calculations by using machine learning methods to predict formation energy and determine the stability of a material. Specifically, graph neural networks display high fidelity in predicting formation energy. Traditionally, graph neural networks are trained on large data sets of relaxed structures. Unfortunately, the geometries of unrelaxed candidate structures produced by CSPA often deviate from the relaxed state, which leads to poor predictions and hinders the model's ability to filter out energetically unfavorable structures prior to ab initio evaluation. This work shows that the prediction error on relaxed structures decreases as training progresses, while the prediction error on unrelaxed structures increases, suggesting an inverse correlation between relaxed and unrelaxed structure prediction accuracy. To remedy this behavior, we propose a simple, physically motivated, computationally cheap perturbation technique that augments training data to dramatically improve predictions on unrelaxed structures. On our test set consisting of 623 Nb-Sr-H hydride structures, we found that training a crystal graph convolutional neural network with our augmentation method reduced the MAE of formation energy prediction by 66% compared to training with only relaxed structures. We then show how this error reduction can accelerate CSPA by improving the model's ability to filter out energetically unfavorable structures accurately.
Comment: 8 pages, 6 figures
Database: arXiv
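
Illustrative sketch: The description mentions a perturbation technique that augments training data but gives no details of it. The Python sketch below shows one plausible form of such an augmentation, under stated assumptions: each atom of a relaxed structure is displaced by a random vector of bounded length, and every perturbed copy keeps the relaxed structure's energy as its label. The function names `perturb_structure` and `augment_dataset`, the `max_disp` parameter, the uniform displacement distribution, and the tuple data layout are all hypothetical and are not taken from the paper.

```python
import numpy as np


def perturb_structure(frac_coords, lattice, max_disp=0.1, rng=None):
    """Displace every atom by a random vector of length <= max_disp.

    frac_coords: (n_atoms, 3) fractional coordinates.
    lattice: (3, 3) matrix whose rows are the lattice vectors.
    max_disp: maximum displacement, in the lattice's length unit
              (assumed value; not from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    frac_coords = np.asarray(frac_coords, dtype=float)
    lattice = np.asarray(lattice, dtype=float)
    n_atoms = frac_coords.shape[0]

    # Random unit directions: normalized Gaussian vectors are isotropic.
    directions = rng.normal(size=(n_atoms, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Random displacement lengths in [0, max_disp] (assumed distribution).
    lengths = rng.uniform(0.0, max_disp, size=(n_atoms, 1))
    cart_disp = directions * lengths

    # Cartesian = fractional @ lattice, so fractional = Cartesian @ inv(lattice).
    frac_disp = cart_disp @ np.linalg.inv(lattice)

    # Wrap back into the unit cell.
    return (frac_coords + frac_disp) % 1.0


def augment_dataset(structures, n_copies=1, max_disp=0.1, rng=None):
    """Return the originals plus n_copies perturbed copies of each structure.

    structures: list of (frac_coords, lattice, relaxed_energy) tuples.
    Each perturbed copy keeps the relaxed energy as its training label
    (an assumption about how the labels are assigned).
    """
    rng = np.random.default_rng() if rng is None else rng
    augmented = list(structures)
    for frac_coords, lattice, energy in structures:
        for _ in range(n_copies):
            new_coords = perturb_structure(frac_coords, lattice, max_disp, rng)
            augmented.append((new_coords, lattice, energy))
    return augmented
```

In this sketch the perturbed copies imitate unrelaxed candidate geometries, so a model trained on the augmented set sees off-equilibrium inputs paired with relaxed-state energies. A suitable `max_disp` would have to be chosen empirically.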