Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

Authors: Bouveyron, Charles; Brunet, Camille
Year of publication: 2011
Subject:
Source: Statistics and Computing, 2011
Document type: Working Paper
DOI: 10.1007/s11222-011-9249-9
Description: Clustering in high-dimensional spaces is a recurrent problem in many scientific domains, but it remains difficult in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model that fits the data in a latent orthonormal discriminative subspace whose intrinsic dimension is lower than that of the original space. Constraining the model parameters within and between groups yields a family of 12 parsimonious DLM models that can accommodate a variety of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets show that the proposed approach performs better than existing clustering methods while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.
Database: arXiv
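
Illustrative sketch (not part of the record): the description mentions an algorithm that estimates the mixture parameters and the discriminative subspace together. The Python code below is a simplified illustration of that general idea, not the authors' Fisher-EM algorithm and not one of the 12 DLM models. It assumes spherical Gaussian clusters inside the subspace, ignores the noise outside it, and uses a plain Fisher criterion (between-cluster scatter relative to total scatter) weighted by soft memberships. The names `fisher_subspace` and `fisher_em_sketch`, the ridge term, and the initialization are hypothetical choices made for this example.

```python
import numpy as np


def fisher_subspace(X, T, d):
    """Return a p x d orthonormal basis that maximizes a Fisher-type
    criterion for soft memberships T (n x K). Illustrative only."""
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean
    nk = T.sum(axis=0) + 1e-10                    # soft cluster sizes
    diff = (T.T @ X) / nk[:, None] - mean         # cluster means, centered
    S_b = (diff * nk[:, None]).T @ diff / n       # between-cluster scatter
    S_t = Xc.T @ Xc / n                           # total scatter
    S_t += 1e-6 * np.trace(S_t) / p * np.eye(p)   # small ridge (assumption)
    # Solve the generalized eigenproblem S_b w = lambda S_t w by whitening.
    L = np.linalg.cholesky(S_t)
    A = np.linalg.solve(L, S_b)                   # L^{-1} S_b
    M = np.linalg.solve(L, A.T)                   # L^{-1} S_b L^{-T}
    M = (M + M.T) / 2
    _, V = np.linalg.eigh(M)                      # eigenvalues ascending
    W = np.linalg.solve(L.T, V[:, -d:])           # map back, top d directions
    U, _ = np.linalg.qr(W)                        # orthonormalize the basis
    return U


def fisher_em_sketch(X, K, d=None, n_iter=30, seed=0):
    """Alternate subspace estimation, parameter updates and membership
    updates. A simplified sketch under the assumptions stated above."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    d = K - 1 if d is None else d
    # Initialization (assumption): nearest of K randomly chosen data points.
    centers = X[rng.choice(n, size=K, replace=False)]
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    T = np.eye(K)[dist.argmin(axis=1)]
    for _ in range(n_iter):
        U = fisher_subspace(X, T, d)              # subspace step
        Y = X @ U                                 # latent coordinates
        nk = T.sum(axis=0) + 1e-10
        pi = nk / n                               # mixing proportions
        mu = (T.T @ Y) / nk[:, None]              # latent cluster means
        sq = ((Y[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        var = (T * sq).sum(axis=0) / (nk * d) + 1e-8   # spherical variances
        # Membership step: posterior probabilities in the latent space.
        logr = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
        logr -= logr.max(axis=1, keepdims=True)
        T = np.exp(logr)
        T /= T.sum(axis=1, keepdims=True)
    return T.argmax(axis=1), U, T
```

Calling `labels, U, T = fisher_em_sketch(X, K=3)` on an `n x p` array returns hard labels, the estimated basis, and the soft memberships; `X @ U` then gives low-dimensional coordinates that can be plotted to visualize the clusters.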