A framework to uncover multiple alternative clusterings.

Author: Dang, Xuan; Bailey, James
Subject:
Source: Machine Learning; Jan 2015, Vol. 98, Issue 1-2, pp. 7-30, 24 p.
Abstract: Clustering is often referred to as unsupervised learning, which aims to uncover hidden structures in data. Unfortunately, although widely used as one of the principal tools for understanding data, most conventional clustering techniques are limited in achieving this goal because they attempt to find only a single clustering solution. In many real-world applications, especially those involving high-dimensional data, the data can commonly be grouped in several different yet meaningful ways. This gives rise to the recently emerging research area of mining alternative clusterings. In this paper, we propose a framework named MACL that is capable of discovering multiple alternative clusterings from a given dataset. MACL seeks alternative clusterings in sequence, and each novel solution is found by conditioning on all previously known clusterings. The framework takes a mathematically appealing approach by combining the maximum likelihood framework with mutual information. Consequently, clustering quality is achieved by maximizing the likelihood over the data, whereas dissimilarity is ensured by minimizing the information shared among alternatives. We test the proposed algorithm on both synthetic and real-world datasets, and the experimental results demonstrate its potential for discovering multiple alternative clusterings from data. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
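
Note: The abstract describes a trade-off between likelihood maximization (quality) and minimizing the information shared with previously found clusterings (dissimilarity). The following is only a minimal sketch of that general idea, not MACL's actual formulation, which this record does not give. The function names `mutual_information` and `penalized_objective`, the `trade_off` weight, the use of hard cluster labels, and the simple "log-likelihood minus weighted mutual information" form are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two hard clusterings of the same points."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same points")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # Contribution p(a, b) * log(p(a, b) / (p(a) * p(b))), written with counts.
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def penalized_objective(log_likelihood, candidate, previous, trade_off=1.0):
    """Hypothetical score: data fit minus information shared with earlier clusterings.

    `log_likelihood` is assumed to be computed elsewhere by whatever model
    produced `candidate`; `previous` is a list of earlier labelings.
    """
    shared = sum(mutual_information(candidate, p) for p in previous)
    return log_likelihood - trade_off * shared
```

Under these assumptions, a candidate clustering that merely repeats an earlier one is penalized by its full mutual information with it, while a candidate that is statistically independent of every earlier clustering incurs no penalty, so at equal log-likelihood the independent candidate scores higher.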