Information Theoretic Hierarchical Clustering

Authors: Babak Nadjar Araabi, Mehdi Aghagolzadeh, Hamid Soltanian-Zadeh
Language: English
Year of publication: 2011
Subject:
Source: Entropy, Vol 13, Iss 2, Pp 450-465 (2011)
Document type: article
ISSN: 1099-4300
DOI: 10.3390/e13020450
Description: Hierarchical clustering has been used extensively in practice because clusters can be assigned and analyzed simultaneously, which is especially useful when estimating the number of clusters is challenging. However, because of the conventional proximity measures these algorithms employ, they can detect only mass-shape clusters and have difficulty identifying complex data structures. Here, we introduce two bottom-up hierarchical approaches that exploit an information-theoretic proximity measure to explore the nonlinear boundaries between clusters and extract data structure beyond second-order statistics. Experimental results on both artificial and real datasets demonstrate the superiority of the proposed algorithm over conventional and information-theoretic clustering algorithms reported in the literature, especially in detecting the true number of clusters.
Database: Directory of Open Access Journals