Prediction of Retention Time and Collision Cross Section (CCSH+, CCSH–, and CCSNa+) of Emerging Contaminants Using Multiple Adaptive Regression Splines

Authors: Alberto Celma, Richard Bade, Juan Vicente Sancho, Félix Hernandez, Melissa Humphries, Lubertus Bijlsma
Contributors: Celma, Alberto, Bade, Richard, Sancho, Juan Vicente, Hernandez, Félix, Humphries, Melissa, Bijlsma, Lubertus
Year of publication: 2022
Subject:
Source: Journal of Chemical Information and Modeling. 62:5425-5434
ISSN: 1549-960X, 1549-9596
DOI: 10.1021/acs.jcim.2c00847
Description: Ultra-high performance liquid chromatography coupled to ion mobility separation and high-resolution mass spectrometry instruments have proven very valuable for screening of emerging contaminants in the aquatic environment. However, when applying suspect or nontarget approaches (i.e., when no reference standards are available), there is no information on retention time (RT) and collision cross-section (CCS) values to facilitate identification. In silico prediction tools for RT and CCS can therefore be of great utility in decreasing the number of candidates to investigate. In this work, Multiple Adaptive Regression Splines (MARS) were evaluated for the prediction of both RT and CCS. MARS prediction models were developed and validated using a database of 477 protonated molecules, 169 deprotonated molecules, and 249 sodium adducts. Multivariate and univariate models were evaluated, with the univariate models showing a better fit to the experimental data. The RT model (R² = 0.855) showed a deviation between predicted and experimental data of ±2.32 min (95% confidence interval). The deviation observed for CCS data of protonated molecules using the CCSₕ model (R² = 0.966) was ±4.05% (95% confidence interval). The CCSₕ model was also tested for the prediction of deprotonated molecules, resulting in deviations below ±5.86% for 95% of the cases. Finally, a third model was developed for sodium adducts (CCSₙₐ, R² = 0.954), with deviations below ±5.25% for 95% of the cases. The developed models have been incorporated in an open-access and user-friendly online platform, which represents a great advantage for third-party research laboratories for predicting both RT and CCS data. Refereed/Peer-reviewed
Database: OpenAIRE
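
The description says that predicted RT and CCS values can reduce the number of candidates to investigate. A minimal sketch of how the reported 95% deviations might be used as filtering windows is given below. It is not the authors' platform or code: the `Candidate` class, the function names, and the adduct labels are hypothetical, and the percentage CCS deviation is assumed to be taken relative to the measured value, which the description does not specify. Only the tolerance figures come from the description.

```python
from dataclasses import dataclass

# Tolerances quoted in the description (deviation covering 95% of cases).
RT_TOLERANCE_MIN = 2.32
CCS_TOLERANCE_PCT = {
    "[M+H]+": 4.05,   # protonated molecules, CCS_H model
    "[M-H]-": 5.86,   # deprotonated molecules, predicted with the CCS_H model
    "[M+Na]+": 5.25,  # sodium adducts, CCS_Na model
}


@dataclass
class Candidate:
    """Hypothetical container for one suspect and its predicted values."""
    name: str
    predicted_rt_min: float
    predicted_ccs: float  # same unit as the measured CCS, e.g. square angstroms


def within_windows(candidate, measured_rt_min, measured_ccs, adduct):
    """Return True if both predictions agree with the measurement."""
    rt_ok = abs(measured_rt_min - candidate.predicted_rt_min) <= RT_TOLERANCE_MIN
    # Assumption: percentage deviation is computed relative to the measured CCS.
    ccs_dev_pct = 100.0 * abs(measured_ccs - candidate.predicted_ccs) / measured_ccs
    ccs_ok = ccs_dev_pct <= CCS_TOLERANCE_PCT[adduct]
    return rt_ok and ccs_ok


def filter_candidates(candidates, measured_rt_min, measured_ccs, adduct):
    """Keep only the candidates whose predictions fall inside both windows."""
    return [
        c for c in candidates
        if within_windows(c, measured_rt_min, measured_ccs, adduct)
    ]
```

With a measured feature at, say, 6.0 min and a CCS of 152.0 for the protonated molecule, calling `filter_candidates(candidates, 6.0, 152.0, "[M+H]+")` would discard any suspect whose predicted RT differs by more than 2.32 min or whose predicted CCS deviates by more than 4.05%. The predicted values themselves would have to come from the published models or another prediction tool.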