An Information-Theoretical Framework for Cluster Ensemble

Authors: Hangyuan Du, Jiye Liang, Liang Bai, Yike Guo
Publication year: 2018
Subject:
Source: IEEE Transactions on Knowledge and Data Engineering. :1-1
ISSN: 2326-3865, 1041-4347
DOI: 10.1109/tkde.2018.2865954
Description: Cluster ensemble is an important tool that aggregates several base clusterings into a single output clustering with improved robustness and stability. However, the quality of the final clustering is often affected by uncertainties in the generation and integration of the base clusterings. In this paper, we develop an information-theoretical framework that seeks a final clustering with high consensus on both the original data set and the set of base clusterings by minimizing these two uncertainties. Within this framework, we provide a weighted consensus measure based on information entropy to evaluate the quality of a clustering, the similarity between clusters, and the similarity between objects. Based on this measure, we propose three weighted cluster ensemble algorithms with different ensemble strategies: the weighted feature consensus algorithm, the weighted relabeling consensus algorithm, and the weighted pairwise-similarity consensus algorithm. In the experimental analysis, we compare the proposed algorithms with existing cluster ensemble algorithms on several data sets. The comparison results indicate that the proposed algorithms are effective and robust.
Database: OpenAIRE
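
Illustrative sketch: the description mentions a weighted pairwise-similarity consensus strategy but gives no formulas, so the following is only a generic stand-in and not the authors' algorithm. It assumes each base clustering is weighted by its mean normalized mutual information (NMI) with the other base clusterings, and it builds a weighted co-association matrix from those weights. The function names (`entropy`, `mutual_information`, `nmi`, `clustering_weights`, `weighted_coassociation`) and the weighting scheme itself are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two label assignments."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log((c * n) / (pa[x] * pb[y]))
        for (x, y), c in pab.items()
    )


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the two entropies."""
    denom = (entropy(a) + entropy(b)) / 2
    # Two single-cluster partitions are treated as identical.
    return mutual_information(a, b) / denom if denom > 0 else 1.0


def clustering_weights(base):
    """Assumed scheme: weight = mean NMI with the other base clusterings.

    Requires at least two base clusterings. Falls back to uniform
    weights when no clustering agrees with any other.
    """
    m = len(base)
    raw = [
        sum(nmi(base[i], base[j]) for j in range(m) if j != i) / (m - 1)
        for i in range(m)
    ]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [1 / m] * m


def weighted_coassociation(base, weights):
    """Entry (i, j) sums the weights of clusterings that co-cluster i and j."""
    n = len(base[0])
    return [
        [
            sum(w for labels, w in zip(base, weights) if labels[i] == labels[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Three base clusterings of four objects; the first two describe the
# same partition under different label names.
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
weights = clustering_weights(base)
similarity = weighted_coassociation(base, weights)
```

The resulting `similarity` matrix could then be passed to any similarity-based clustering method (for example, hierarchical clustering) to obtain a final partition; that step is omitted here because the record does not say how the authors perform it.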