Clusterability assessment for Gaussian mixture models

Authors: Jacek Koronacki, Stan Lipovetsky, Ewa Nowakowska
Year of publication: 2015
Subject:
Source: Applied Mathematics and Computation. 256:591-601
ISSN: 0096-3003
DOI: 10.1016/j.amc.2014.12.038
Description: There are numerous measures designed to evaluate the quality of a given data grouping in terms of its distinctness and between-cluster separation. However, there seems to be no efficient method for assessing the distinctness of the intrinsic structure within data (clusterability) before the actual clustering is determined. Based on recent findings, we propose such a measure in terms of a covariance matrix decomposition for appropriately transformed data. The data is assumed to come from a Gaussian mixture model. The transformation reshapes the data so that the unsupervised technique of principal component analysis is able to uncover information directly indicative of the data's clusterability characteristics. In this work we propose the measure and explain its motivation as well as its relation to supervised structure distinctness coefficients. We also show how the measure can be applied to selecting the number of clusters and to feature selection.
Database: OpenAIRE