Sparse Projection Pursuit Analysis: An Alternative for Exploring Multivariate Chemical Data
Author: | Stephen Driscoll, Yannick S. MacMillan, Peter D. Wentzell |
---|---|
Year of publication: | 2019 |
Subject: |
Chemistry
010401 analytical chemistry; Feature selection; Pattern recognition; 010402 general chemistry; 01 natural sciences; 0104 chemical sciences; Analytical Chemistry; Hierarchical clustering; Principal component analysis; Projection pursuit; Kurtosis; Artificial intelligence; Projection (set theory); Interpretability; Data compression |
Source: | Analytical Chemistry. 92:1755-1762 |
ISSN: | 1520-6882 0003-2700 |
DOI: | 10.1021/acs.analchem.9b03166 |
Description: | Sparse projection pursuit analysis (SPPA), a new approach for the unsupervised exploration of high-dimensional chemical data, is proposed as an alternative to traditional exploratory methods such as principal components analysis (PCA) and hierarchical cluster analysis (HCA). Where traditional methods use variance and distance metrics for data compression and visualization, the proposed method incorporates the fourth statistical moment (kurtosis) to access interesting subspaces that can clarify relationships within complex data sets. The quasi-power algorithm used for projection pursuit is coupled with a genetic algorithm for variable selection to efficiently generate sparse projection vectors that improve the chemical interpretability of the results while at the same time mitigating the problem of overmodeling. Several multivariate chemical data sets are employed to demonstrate that SPPA can reveal meaningful clusters in the data where other unsupervised methods cannot. |
Database: | OpenAIRE |
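
The description contrasts variance-based compression (PCA) with a kurtosis-based projection index. The sketch below is only an illustration of that contrast on synthetic two-dimensional data, not the authors' method: it uses a brute-force scan over angles rather than the quasi-power algorithm, and it omits the genetic-algorithm variable selection entirely. The function names `kurtosis` and `min_kurtosis_direction`, the data, and all parameters are assumptions made for this example.

```python
import numpy as np

def kurtosis(z):
    """Fourth standardized moment of a 1-D projection (about 3 for a Gaussian)."""
    z = z - z.mean()
    return np.mean(z**4) / np.mean(z**2) ** 2

def min_kurtosis_direction(X, n_angles=360):
    """Hypothetical helper: scan unit vectors in 2-D and return the one
    whose projection has the lowest kurtosis (brute force, illustration only)."""
    Xc = X - X.mean(axis=0)
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])
    scores = np.array([kurtosis(Xc @ v) for v in dirs])
    best = np.argmin(scores)
    return dirs[best], scores[best]

# Synthetic data (assumed): two clusters separated along axis 1,
# with much larger but uninformative Gaussian spread along axis 0.
rng = np.random.default_rng(0)
n = 200
labels = np.repeat([-1.0, 1.0], n)
X = np.column_stack([
    rng.normal(0.0, 10.0, 2 * n),
    2.0 * labels + rng.normal(0.0, 0.5, 2 * n),
])

# First principal component: the direction of maximum variance.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]
pc1_kurt = kurtosis(Xc @ pc1)

# Minimum-kurtosis direction: favors bimodal (clustered) projections.
pp_dir, pp_kurt = min_kurtosis_direction(X)

print("PC1 direction:", pc1, "kurtosis:", pc1_kurt)
print("Min-kurtosis direction:", pp_dir, "kurtosis:", pp_kurt)
```

In this toy setting the first principal component follows the high-variance noise axis, while the minimum-kurtosis direction aligns with the axis that separates the two clusters, since a balanced bimodal projection has kurtosis well below the Gaussian value of 3.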