An uncertainty perspective to PCM and APCM clustering
Authors: | Shuguang Liu, Jiguang Yue, Peixin Hou, Hao Deng |
---|---|
Year of publication: | 2018 |
Subject: | Computer science; Applied Mathematics; Fuzzy set; Closeness; 02 engineering and technology; 01 natural sciences; Theoretical Computer Science; Determining the number of clusters in a data set; Artificial Intelligence; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Cluster (physics); 020201 artificial intelligence & image processing; Noise level; 010306 general physics; Cluster analysis; Algorithm; Merge (version control); Software |
Source: | International Journal of Approximate Reasoning. 95:194-212 |
ISSN: | 0888-613X |
DOI: | 10.1016/j.ijar.2018.02.006 |
Description: | Possibilistic c-means (PCM) based clustering algorithms are widely used in the literature. Recently, adaptive PCM (APCM) was proposed, which adapts the bandwidth at each iteration and merges clusters automatically. The cluster elimination ability of APCM makes PCM very flexible in setting the initial cluster number m_ini. However, this comes at the price of introducing another parameter α, which ranges over (0, +∞). This study tries to utilize the uncertainty in the data to achieve more control over the clustering process by appropriately characterizing the uncertainty of memberships via the conditional fuzzy set. This uncertainty perspective motivates us to introduce parameters σ_v and α to characterize the uncertainty of the estimated bandwidth and the noise level of the dataset, respectively, which results in a unified framework of PCM and APCM (UPCM). UPCM is further developed by eliminating the σ_v parameter, yielding PCM clustering based on noise level (NPCM). As a result, the algorithm needs two kinds of information that are intuitive to specify for the clustering task: information about the number of clusters and information about the properties of the clusters. They are represented by two parameters: m_ini specifies the possibly over-specified cluster number, and α characterizes the closeness of clusters in the clustering result. Neither parameter needs to be specified exactly, and α ranges over [0, 1]. Experiments show that the clustering process can be effectively controlled by the parameters. |
Database: | OpenAIRE |
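For readers unfamiliar with the baseline that the description refers to, the following is a minimal sketch of a plain possibilistic c-means loop. It is not the UPCM or NPCM algorithm of the paper, and it does not adapt bandwidths or merge clusters as APCM is described as doing. It assumes one commonly used exponential membership form, u_ij = exp(-d_ij² / η_j), with bandwidths η_j held fixed; the function names `pcm_memberships`, `update_centers`, and `pcm`, and the parameter `n_iter`, are hypothetical and chosen only for illustration.

```python
import math


def pcm_memberships(points, centers, bandwidths):
    """Return u[i][j] = exp(-d_ij^2 / eta_j) for every point i and cluster j.

    Assumption: exponential membership form with a fixed bandwidth eta_j
    per cluster. Rows are not normalised, so they need not sum to 1.
    """
    memberships = []
    for x in points:
        row = []
        for c, eta in zip(centers, bandwidths):
            d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            row.append(math.exp(-d2 / eta))
        memberships.append(row)
    return memberships


def update_centers(points, memberships, old_centers):
    """Move each center to the membership-weighted mean of all points."""
    dim = len(points[0])
    new_centers = []
    for j, old in enumerate(old_centers):
        total = sum(u[j] for u in memberships)
        if total == 0.0:
            # No point supports this cluster; keep the old center.
            new_centers.append(list(old))
            continue
        new_centers.append([
            sum(u[j] * x[k] for u, x in zip(memberships, points)) / total
            for k in range(dim)
        ])
    return new_centers


def pcm(points, centers, bandwidths, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps."""
    memberships = pcm_memberships(points, centers, bandwidths)
    for _ in range(n_iter):
        centers = update_centers(points, memberships, centers)
        memberships = pcm_memberships(points, centers, bandwidths)
    return centers, memberships


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up toy data).
    data = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
            [5.0, 5.0], [5.2, 5.0], [5.0, 5.2]]
    final_centers, u = pcm(data, [[0.5, 0.5], [4.5, 4.5]], [1.0, 1.0])
    print(final_centers)
```

Because the memberships are not constrained to sum to 1 across clusters, a point far from every center receives a low membership everywhere, which is the property that makes possibilistic clustering tolerant of noise. The fixed bandwidths here are the part that APCM, and the framework described above, replace with quantities updated during the run.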