An Information-Theoretical Framework for Cluster Ensemble
Authors: | Hangyuan Du, Jiye Liang, Liang Bai, Yike Guo |
---|---|
Publication year: | 2018 |
Subject: |
Computer science
02 engineering and technology; computer.software_genre; Computer Science Applications; Data set; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; Robustness (computer science); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Cluster (physics); Entropy (information theory); Data mining; Cluster analysis; computer; Information Systems |
Source: | IEEE Transactions on Knowledge and Data Engineering. :1-1 |
ISSN: | 2326-3865 1041-4347 |
DOI: | 10.1109/tkde.2018.2865954 |
Description: | Cluster ensemble is an important tool that aggregates several base clusterings to generate a single output clustering with improved robustness and stability. However, the quality of the final clustering is often affected by uncertainties in the generation and integration of base clusterings. In this paper, we develop an information-theoretical framework that aims to obtain a final clustering with high consensus on both the original data set and the base clustering set by minimizing these two uncertainties of cluster ensemble. In this framework, we provide a weighted consensus measure based on information entropy to evaluate the quality of a clustering, the similarity between clusters, and the similarity between objects. Based on this measure, we propose three weighted cluster ensemble algorithms with different ensemble strategies: the weighted feature consensus algorithm, the weighted relabeling consensus algorithm, and the weighted pairwise-similarity consensus algorithm. In the experimental analysis, we compare the proposed algorithms with existing cluster ensemble algorithms on several data sets. The comparison results show that the proposed algorithms are effective and robust. |
Database: | OpenAIRE |
External link: |
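
The pairwise-similarity strategy named in the description can be illustrated with a generic weighted co-association matrix: the similarity of two objects is the weighted share of base clusterings that place them in the same cluster. This is a minimal sketch of that general idea only, not the algorithm from the paper. The function name `weighted_coassociation` and the `weights` argument are hypothetical, and the paper's entropy-based weighting is not reproduced here; the weights are simply passed in by the caller.

```python
def weighted_coassociation(base_clusterings, weights=None):
    """Return an n x n matrix whose (i, j) entry is the weighted fraction
    of base clusterings that assign objects i and j to the same cluster.

    base_clusterings: list of label lists, one per base clustering.
    weights: optional per-clustering weights (hypothetical input; uniform
             weights are assumed when omitted).
    """
    m = len(base_clusterings)
    n = len(base_clusterings[0])
    if weights is None:
        weights = [1.0] * m
    total = sum(weights)

    sim = [[0.0] * n for _ in range(n)]
    for labels, w in zip(base_clusterings, weights):
        for i in range(n):
            for j in range(n):
                # Label values are only compared within one clustering,
                # so different labelings of the same partition agree.
                if labels[i] == labels[j]:
                    sim[i][j] += w / total
    return sim


# Three toy base clusterings of four objects.
base = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with labels swapped
    [0, 0, 0, 1],
]
S = weighted_coassociation(base)
```

With uniform weights, objects 0 and 1 are grouped together by every base clustering, while objects 0 and 3 never are. A consensus clustering would then be obtained by running any similarity-based clustering method on `S`; which method the paper uses is not stated in this record.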