Unsupervised Machine Learning for Unbiased Chemical Classification in X-ray Absorption Spectroscopy and X-ray Emission Spectroscopy
Author: | Gerald T. Seidler, Niranjan Govind, Samantha Tetef |
---|---|
Year of publication: | 2021 |
Subject: | X-ray absorption spectroscopy; Materials science; Computer science; General Physics and Astronomy; Pattern recognition; Electronic structure; Chemical classification; Autoencoder; XANES; Principal component analysis; Embedding; Unsupervised learning; Molecule; Artificial intelligence; Emission spectrum; Physical and Theoretical Chemistry; Biological system |
DOI: | 10.26434/chemrxiv-2021-5tvrv |
Description: | We report a comprehensive computational study of unsupervised machine learning for extraction of chemically relevant information in X-ray absorption near edge structure (XANES) and in valence-to-core X-ray emission spectra (VtC-XES) for classification of a broad ensemble of sulphorganic molecules. By progressively decreasing the constraining assumptions of the unsupervised machine learning algorithm, moving from principal component analysis (PCA) to a variational autoencoder (VAE) to t-distributed stochastic neighbour embedding (t-SNE), we find improved sensitivity to steadily more refined chemical information. Surprisingly, when embedding the ensemble of spectra in merely two dimensions, t-SNE distinguishes not just oxidation state and general sulphur bonding environment but also the aromaticity of the bonding radical group with 87% accuracy as well as identifying even finer details in electronic structure within aromatic or aliphatic sub-classes. We find that the chemical information in XANES and VtC-XES is very similar in character and content, although they unexpectedly have different sensitivity within a given molecular class. We also discuss likely benefits from further effort with unsupervised machine learning and from the interplay between supervised and unsupervised machine learning for X-ray spectroscopies. Our overall results, i.e., the ability to reliably classify without user bias and to discover unexpected chemical signatures for XANES and VtC-XES, likely generalize to other systems as well as to other one-dimensional chemical spectroscopies. |
Database: | OpenAIRE |
External link: |
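
The description above mentions embedding an ensemble of spectra in two dimensions, starting with PCA as the most constrained method. As a rough illustration of that first step only, the sketch below projects toy spectra onto two principal components with NumPy. Everything in it is an assumption made for illustration: the synthetic single-peak "spectra", the two class peak positions, and the helper names `synthetic_spectra` and `pca_embed` do not come from the paper, and the VAE and t-SNE stages are not reproduced here.

```python
import numpy as np


def synthetic_spectra(n_per_class=50, n_points=200, seed=0):
    """Toy stand-in for spectra: one Gaussian peak whose position depends on the class.

    Hypothetical data, not the XANES or VtC-XES spectra from the study.
    """
    rng = np.random.default_rng(seed)
    energy = np.linspace(0.0, 1.0, n_points)
    centers = {0: 0.35, 1: 0.65}  # assumed peak positions for two made-up classes
    spectra, labels = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            shift = rng.normal(0.0, 0.01)            # small per-sample peak shift
            peak = np.exp(-0.5 * ((energy - center - shift) / 0.05) ** 2)
            noise = rng.normal(0.0, 0.02, n_points)  # additive measurement-like noise
            spectra.append(peak + noise)
            labels.append(label)
    return np.array(spectra), np.array(labels)


def pca_embed(X, n_components=2):
    """Project the rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)                          # mean-center each energy channel
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


spectra, labels = synthetic_spectra()
embedding = pca_embed(spectra)                       # shape: (n_spectra, 2)

# Compare the distance between class means along PC1 with the within-class spread.
pc1 = embedding[:, 0]
gap = abs(pc1[labels == 0].mean() - pc1[labels == 1].mean())
spread = max(pc1[labels == 0].std(), pc1[labels == 1].std())
print(f"class gap along PC1: {gap:.2f}, within-class spread: {spread:.2f}")
```

The labels are used only after the embedding is computed, to check how well the classes separate; the projection itself never sees them, which is the sense in which such a classification is unsupervised.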