Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis
Author: | Seohee Park, Seongeun Kim, Ji Hoon Ryoo, Hyun Suk Ryoo |
---|---|
Year of publication: | 2020 |
Subject: |
Fuzzy clustering; Physics and Astronomy (miscellaneous); Computer science; General Mathematics; Population; FIT-FHV method; generalized structured component analysis; computer.software_genre; structural equation modeling; Fuzzy logic; 03 medical and health sciences; 0302 clinical medicine; 0504 sociology; Component analysis; Computer Science (miscellaneous); Cluster (physics); education; education.field_of_study; lcsh:Mathematics; 05 social sciences; 050401 social sciences methods; Centroid; Statistical model; lcsh:QA1-939; cluster validity problem; Chemistry (miscellaneous); fuzzy hypervolume validity index; Data mining; computer; 030217 neurology & neurosurgery |
Source: | Symmetry, Vol. 12, Issue 9, Article 1514 (2020) |
ISSN: | 2073-8994 |
DOI: | 10.3390/sym12091514 |
Description: | Fuzzy clustering is widely used to partition data into K clusters by assigning each data point a membership probability for each of K centroids. It has also been used to characterize clusters in conjunction with a statistical model such as structural equation modeling. The characteristics identified by the statistical model further define the clusters as heterogeneous groups drawn from a population. Recently, such a model has been formulated as fuzzy clusterwise generalized structured component analysis (fuzzy clusterwise GSCA). As in fuzzy clustering, fuzzy clusterwise GSCA requires the number of clusters to be determined in order to make inferences about the population and its parameters. However, identifying the clusters in fuzzy clustering is difficult because classification indexes are data-dependent, an issue known as the cluster validity problem. We examined the cluster validity problem within the fuzzy clusterwise GSCA framework and proposed a new criterion for selecting the optimal number of clusters that uses both the fit indexes of GSCA and the fuzzy validity indexes of fuzzy clustering. The criterion, named the FIT-FHV method, combines a fit index from GSCA (FIT) with a cluster validation measure from fuzzy clustering (FHV), and it performed better than the other indices used in fuzzy clusterwise GSCA. |
Database: | OpenAIRE |
External link: |
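
The description refers to fuzzy membership probabilities relative to K centroids and to the fuzzy hypervolume (FHV) validity index. The following is a minimal Python sketch of those two ingredients only, not the authors' FIT-FHV method, whose combination rule is not given in this record. It assumes the standard fuzzy c-means membership formula and the commonly cited form of FHV (the sum over clusters of the square root of the determinant of a fuzzy covariance matrix). The fuzzifier `m = 2.0`, the use of `u**m` as covariance weights, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def memberships(points, centroids, m=2.0):
    """Fuzzy c-means style memberships u[i][k] of point i in cluster k."""
    u = []
    for x in points:
        # Squared Euclidean distance from x to every centroid.
        d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centroids]
        if any(d == 0.0 for d in d2):
            # x sits exactly on a centroid: split membership among the hits.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            u.append([h / sum(hits) for h in hits])
            continue
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return result

def fuzzy_hypervolume(points, centroids, u, m=2.0):
    """Sum over clusters of sqrt(det(fuzzy covariance)); weights u**m are an assumption."""
    n, dim = len(points), len(points[0])
    total = 0.0
    for k, v in enumerate(centroids):
        w = [u[i][k] ** m for i in range(n)]
        w_sum = sum(w)
        cov = [[sum(w[i] * (points[i][r] - v[r]) * (points[i][c] - v[c])
                    for i in range(n)) / w_sum
                for c in range(dim)]
               for r in range(dim)]
        # Clamp tiny negative determinants caused by rounding.
        total += math.sqrt(max(det(cov), 0.0))
    return total

# Hypothetical toy data: two well-separated groups and hand-picked centroids.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centroids = [(0.3, 0.3), (5.3, 5.3)]
u = memberships(points, centroids)
fhv = fuzzy_hypervolume(points, centroids, u)
```

In a cluster-number search, one would typically fit the model for several values of K and compare the resulting index values, with a smaller hypervolume usually read as a more compact partition. How FHV is weighed against the GSCA FIT index is specific to the paper and is not reproduced here.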