Improved classification of fused data: Synergetic effect of partial least squares discriminant analysis (PLS-DA) and common components and specific weights analysis (CCSWA) combination as applied to tomato profiles (NMR, IR and IRMS)

Authors: Helmut Wachter, Douglas N. Rutledge, Yulia B. Monakhova, Norbert Christoph, Monika Hohmann
Contributors: Spectral Service AG, Institute of Chemistry, Saratov State University, Institute of Pharmacy and Food Chemistry, Julius-Maximilians-Universität Würzburg [Würzburg, Germany] (JMU), Bavarian Health and Food Safety Authority, Ingénierie, Procédés, Aliments (GENIAL), Institut National de la Recherche Agronomique (INRA)-AgroParisTech, Russian President's grant for young scientists [MK-6226.2016.3], Bavarian Health and Food Safety Authority [Oberschleißheim, Germany]
Language: English
Year of publication: 2016
Subject:
Source: Chemometrics and Intelligent Laboratory Systems
Chemometrics and Intelligent Laboratory Systems, Elsevier, 2016, 156, pp.1-6. ⟨10.1016/j.chemolab.2016.05.006⟩
ISSN: 0169-7439
DOI: 10.1016/j.chemolab.2016.05.006
Description: Discriminant analysis (DA) methods are well-known chemometric approaches for solving classification problems in chemistry. Recently, specific multiblock methods, such as common components and specific weights analysis (CCSWA), have been developed that make it possible to enhance the quality of classification models by combining data from different analytical platforms. In this study we propose a new data fusion methodology, PLS-DA-CCSWA, which combines the discriminant power of the PLS-DA method with the capability of CCSWA to extract the maximum of useful information from the different data blocks. A large dataset (n = 112) of ¹H NMR, infrared and isotope ratio mass spectral profiles of authentic tomato samples was analyzed to demonstrate the principle. The classification model developed was used to predict the tomato production type (organic or conventional). The application of the new method resulted in improved classification performance for test set samples according to the Wilks' lambda test. Moreover, a clear decrease in the standard deviations of the predicted Y-values was observed, falling on average from 0.24 for the classical CCSWA to 0.18 for PLS-DA-CCSWA. The procedure to determine the number of common components and the number of latent variables is discussed. The PLS-DA-CCSWA method is shown to be preferable to separate PLS-DA and CCSWA approaches for classification based on fused spectroscopic measurements.
Database: OpenAIRE