A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data
Authors: | Wei Pang, Xiaowei Zhao, Guozhong Feng, Ruonan Li, Jinchao Ji, Fei He |
---|---|
Year of publication: | 2021 |
Subject: |
General Computer Science
Computer science; 0211 other engineering and technologies; 02 engineering and technology; Data clustering; computer.software_genre; 0202 electrical engineering electronic engineering information engineering; Cluster (physics); General Materials Science; numeric and categorical attributes; Electrical and Electronic Engineering; Representation (mathematics); Cluster analysis; Categorical variable; 021103 operations research; Series (mathematics); General Engineering; multi-view learning; Term (time); Statistical classification; mixed data; Benchmark (computing); 020201 artificial intelligence & image processing; lcsh:Electrical engineering. Electronics. Nuclear engineering; Data mining; lcsh:TK1-9971; computer |
Source: | IEEE Access, Vol 9, Pp 24913-24924 (2021) |
ISSN: | 2169-3536 |
Description: | Clustering data with both numeric and categorical attributes is of great importance, as such data are ubiquitous in real-world problems. In many problems, multi-view learning approaches have proven more effective than single-view learning and generalise better. However, most existing clustering algorithms developed for mixed numeric and categorical data are single-view. In this research, we propose a novel multi-view clustering algorithm for mixed data based on the k-prototypes algorithm, which we term Multi-view K-Prototypes. To the best of our knowledge, our proposed Multi-view K-Prototypes is the first multi-view version of the well-known k-prototypes algorithm. To cluster mixed data over multiple views, we present a novel prototype representation of cluster centres in the multi-view setting, and we devise formulas for updating the cluster centres over each view. We then propose the concept of consensus cluster centres to produce the final clustering result. Finally, we carried out a series of experiments on four benchmark datasets to assess the performance of the proposed Multi-view K-Prototypes algorithm. Experimental results show that Multi-view K-Prototypes outperforms seven state-of-the-art algorithms in most cases. |
Database: | OpenAIRE |
External link: |
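
The description builds on the k-prototypes algorithm, so a minimal sketch of the classic single-view k-prototypes building blocks may help orient the reader. This is not the paper's method: the multi-view prototype representation, the per-view update formulas, and the consensus cluster centres are not specified in this record and are not reproduced here. The function names and the weight parameter `gamma` are illustrative assumptions.

```python
from collections import Counter


def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Classic k-prototypes cost between one object and one prototype.

    Numeric attributes contribute squared Euclidean distance; categorical
    attributes contribute the number of mismatches, weighted by gamma.
    (gamma is an assumed illustrative default, not a value from the paper.)
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical_part = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric_part + gamma * categorical_part


def update_prototype(members_num, members_cat):
    """Recompute a single-view prototype from the objects in a cluster.

    Numeric part: per-attribute mean. Categorical part: per-attribute mode.
    """
    n = len(members_num)
    proto_num = [sum(col) / n for col in zip(*members_num)]
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return proto_num, proto_cat


if __name__ == "__main__":
    # Toy cluster with two numeric and two categorical attributes.
    nums = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    cats = [["a", "x"], ["a", "y"], ["b", "y"]]
    centre = update_prototype(nums, cats)
    print(centre)
    print(mixed_dissimilarity(nums[0], cats[0], *centre, gamma=0.5))
```

A multi-view variant would, under the description in this record, maintain such prototypes per view and combine them into consensus centres; how that combination is done is left to the paper itself.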