Graph Convolutional Networks for Improved Prediction and Interpretability of Chromatographic Retention Data

Authors:	Kyriakos Efthymiadis, Peter Van Broeck, Robbin Bouwmeester, Alexander Kensert, Gert Desmet, Deirdre Cabooter
Contributors:	Chemical Engineering and Industrial Chemistry, Faculty of Engineering, Artificial Intelligence, Informatics and Applied Informatics, Department of Bio-engineering Sciences, Industrial Microbiology, Chemical Engineering and Separation Science, Centre for Molecular Separation Science & Technology
Year of publication:	2021
Subject:	Chromatography Reverse-Phase Chromatography Chemistry Significant difference Mean absolute error interaction liquid chromatography Molecular machine Separation Analytical Chemistry Set (abstract data type) Machine Learning Graph (abstract data type) Representation (mathematics) Hydrophobic and Hydrophilic Interactions Algorithms Hydrophilic-Interaction Chromatography Interpretability Chromatography Liquid
Source:	Analytical Chemistry. 93(47)
ISSN:	1520-6882
Description:	Machine learning is a popular technique for predicting the retention times of molecules from descriptors. Descriptors and their associated labels (e.g., retention times) for a set of molecules can be used to train a machine learning algorithm. However, descriptors are fixed molecular features that are not necessarily optimized for the given machine learning problem (e.g., predicting retention times). Recent advances in molecular machine learning use so-called graph convolutional networks (GCNs), which learn molecular representations from atoms and their bonds to adjacent atoms, so that the representation is optimized for the given problem. In this study, two GCNs were implemented to predict the retention times of molecules for three different chromatographic data sets and were compared to seven benchmarks (including two state-of-the-art machine learning models). Additionally, saliency maps were computed from the trained GCNs to better interpret the importance of certain molecular substructures in the data sets. Based on the overall observations of this study, the GCNs performed better than all benchmarks, either significantly outperforming them (5-25% lower mean absolute error) or performing similarly to them (
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8f2c713fd402f58f7f6b1ec22d142b3b https://pubmed.ncbi.nlm.nih.gov/34780168
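
Illustrative note: the description above says that GCNs learn molecular representations from atoms and their bonds, and that models are compared by mean absolute error. The record does not specify the architecture used in the study, so the following is only a minimal sketch under stated assumptions: it uses the widely known symmetric-normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W), a toy three-atom molecule, one-hot atom features, and hypothetical untrained weights. None of the names or values below come from the publication.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features when aggregating.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph convolution step: ReLU(A_norm @ H @ W)."""
    mixed = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(mixed, weights)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy molecule (assumption): heavy atoms of ethanol, C-C-O.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# One-hot atom features (assumption): [is_carbon, is_oxygen].
features = [[1.0, 0.0],
            [1.0, 0.0],
            [0.0, 1.0]]
# Hypothetical, untrained weights; a real model would learn these.
weights = [[0.5, -0.2],
           [0.1, 0.3]]

hidden = gcn_layer(adj, features, weights)
# Sum-pool the atom vectors into a single molecular representation.
molecule_vector = [sum(col) for col in zip(*hidden)]
```

In a trained model, a vector like `molecule_vector` would be passed to further layers that output a predicted retention time, and `mean_absolute_error` would compare those predictions with measured values.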