Three-way clustering method for incomplete information system based on set-pair analysis

Authors: Ruiyan Gao, Xiaoze Feng, Chunying Zhang, Hao Qin
Publication year: 2019
Subject:
Source: Granular Computing. 6:389-398
ISSN: 2364-4974
2364-4966
DOI: 10.1007/s41066-019-00197-z
Description: Traditional clustering algorithms assign every sample, including uncertain ones, to exactly one cluster, which fails to reflect that a cluster may not have a clear boundary. On datasets with a large number of missing values, traditional clustering methods also cannot achieve a good clustering effect. Therefore, the idea of three-way decision is introduced into traditional k-means clustering and combined with the knowledge of set-pair information granules. This paper presents a three-way clustering method that can process missing values effectively. First, the granularity corresponding to missing values is recorded as the degree of difference. Next, the distance between each sample and each cluster center is established according to set-pair theory. All samples are then assigned to clusters according to this distance, forming a three-way clustering result that consists of a positive region, a boundary region, and a negative region, which improves the structure of the clustering results. Samples in the positive region certainly belong to the cluster; samples in the boundary region may belong to the cluster; samples in the negative region do not belong to the cluster; the clustering result is represented by the three regions together. Finally, the validity of the algorithm is verified on UCI datasets.
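Illustrative sketch (not taken from the paper): the abstract does not give the set-pair distance formula or the region thresholds, so the following Python only mirrors the general shape of the assignment step. The function names, the root-mean-square distance over observed features, the use of the missing-feature share as the "degree of difference", and the thresholds `alpha`, `beta`, and `max_diff` are all assumptions made for illustration.

```python
import math

def set_pair_distance(sample, center):
    """Assumed stand-in for the set-pair distance (hypothetical formula).

    Returns (rms gap over observed features, share of missing features).
    Missing values are represented by None.
    """
    observed = [(x, c) for x, c in zip(sample, center) if x is not None]
    n = len(sample)
    diff_degree = (n - len(observed)) / n  # assumed "degree of difference"
    if not observed:
        return math.inf, diff_degree
    rms = math.sqrt(sum((x - c) ** 2 for x, c in observed) / len(observed))
    return rms, diff_degree

def three_way_assign(sample, centers, alpha=1.0, beta=2.0, max_diff=0.0):
    """Label the sample as 'positive', 'boundary', or 'negative' per cluster.

    alpha, beta, and max_diff are hypothetical thresholds, not from the paper.
    """
    regions = []
    for center in centers:
        dist, diff = set_pair_distance(sample, center)
        if dist <= alpha and diff <= max_diff:
            regions.append("positive")   # close and fully observed
        elif dist <= beta:
            regions.append("boundary")   # plausible member, but uncertain
        else:
            regions.append("negative")   # too far from this cluster
    return regions

centers = [(0.0, 0.0), (5.0, 5.0)]
complete = three_way_assign((0.5, 0.5), centers)
incomplete = three_way_assign((0.5, None), centers)
```

Under these assumptions, a sample with a missing feature that lies near a center is placed in that cluster's boundary region instead of its positive region, which is the qualitative behavior the abstract describes.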
Database: OpenAIRE