Crystal Site Feature Embedding Enables Exploration of Large Chemical Spaces

Authors: Kevin Ryczko, Kyle Mills, Edward H. Sargent, Isaac Tamblyn, Hitarth Choubisa, Mikhail Askerka, Oleksandr Voznyy
Publication year: 2020
Subject:
Source: Matter. 3:433-448
ISSN: 2590-2385
Description: Mapping materials science problems onto computational frameworks suitable for machine learning can accelerate materials discovery. Combining the proposed crystal site feature embedding (CSFE) representation with convolutional and extensive deep neural networks, we achieve low mean absolute test errors of 3.7 meV/atom and 0.069 eV on density functional theory energies and band gaps, respectively, of mixed halide perovskites. We explore how a small amount of cadmium doping can potentially be applied in solar cell design and sample the large chemical space by using a variational autoencoder to discover interesting perovskites with band gaps in the ultraviolet and infrared. Additionally, we use CSFE to explore chemical spaces and small doping concentrations beyond those used for training. We further show that CSFE has mean absolute test errors of 7 meV/atom and 0.13 eV for total energies and band gaps, respectively, of 2D perovskites and discuss its adaptability for exploration of an even wider variety of chemical systems.
Density functional theory (DFT) is of interest in modern-day materials discovery. However, DFT is computationally expensive. Here, we develop a new crystal site feature embedding (CSFE) representation that achieves low error in predicting DFT properties and enables predicting properties of chemical families and doping fractions beyond those present in the training datasets. Using CSFE with autoencoders, we present a scheme that enables sampling of large chemical spaces and offers insight into key semiconductor parameters such as band gap. We demonstrate that CSFE works on both 2D and 3D perovskites and identify promising ultraviolet and infrared candidate materials.
Here, we report crystal site feature embedding (CSFE), a representation for machine learning of materials that achieves low mean absolute errors for density functional theory band gaps and formation energies. Using CSFE with convolutional neural networks (CNNs) and extensive deep neural networks (EDNNs), we explored chemical families and doping fractions beyond those present in the training dataset. CSFE allowed us to sample large chemical spaces for materials of interest using autoencoders. We demonstrate the application of the representation by finding perovskite compositions with band gaps in the ultraviolet and infrared.
Database: OpenAIRE