Performance analysis of clustering internal validation indexes with asymmetric clusters
Author: | N. Duro, Marco Mora, Juan Carlos Rojas-Thomas, Matilde Santos |
---|---|
Year of publication: | 2019 |
Subject: |
Index (economics); General Computer Science; Relation (database); Computer science; 05 social sciences; 050301 education; 02 engineering and technology; Variance (accounting); computer.software_genre; Measure (mathematics); Set (abstract data type); Tree (data structure); Approximation error; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; Electrical and Electronic Engineering; Cluster analysis; 0503 education; computer |
Source: | IEEE Latin America Transactions. 17:807-814 |
ISSN: | 1548-0992 |
DOI: | 10.1109/tla.2019.8891949 |
Description: | This work evaluates the performance of a set of internal clustering validation indexes on artificial and real data sets with respect to a specific structural characteristic: clusters whose geometries are asymmetric. To this end, the concept of symmetry is formalized with respect to the axes of maximum variance of the clusters through the definition of a new index. A novel methodology is then proposed to evaluate the performance of internal indexes for crisp clustering on data sets with this structural characteristic. The newly defined index is combined with correlation analysis, making it possible to dynamically evaluate the performance of 11 internal indexes well known in the literature, as well as a recently proposed index, the Representative Tree Index (RTI). The methodology thus measures not only the absolute error of the indexes for a particular cluster configuration, but also the degree to which the structural characteristic of interest influences each index's performance, yielding a more general understanding of their behavior. |
Database: | OpenAIRE |