New Internal Clustering Evaluation Index Based on Line Segments

Authors: Juan Carlos Rojas Thomas, Matilde Santos Peñas
Publication year: 2019
Subject:
Source: Intelligent Data Engineering and Automated Learning – IDEAL 2019 ISBN: 9783030336066
IDEAL (1)
DOI: 10.1007/978-3-030-33607-3_57
Description: This work proposes a new internal clustering evaluation index based on line segments as the central elements of the clusters. Data dispersion is calculated as the average of the distances from the points of a cluster to the respective line segment. The work also defines a new distance measure based on a line segment that connects the centroids of the clusters, from which an approximation of the edges of their geometries is obtained. The proposed index is validated in a series of experiments on 10 artificial data sets generated with different cluster characteristics, such as size, shape, noise and dimensionality, and on 8 real data sets. In these experiments, the new index is compared with 12 representative indices from the literature and outperforms all of them. These results support the effectiveness of the proposal and show the appropriateness of including geometric properties in the definition of internal indices.
Database: OpenAIRE
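
Note: The dispersion term mentioned in the description (the average distance from a cluster's points to its line segment) can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not say how the segment endpoints are chosen, so the endpoints `a` and `b` are simply assumed to be given, and the function names `point_to_segment_distance` and `segment_dispersion` are hypothetical.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        # Degenerate segment (a == b): fall back to the distance to the point a.
        t = 0.0
    else:
        # Projection parameter of p onto the line through a and b,
        # clamped to [0, 1] so the closest point stays on the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / denom
        t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)


def segment_dispersion(points, a, b):
    """Average distance from the points of one cluster to the segment (a, b).

    Assumption: the endpoints are supplied by the caller; how the paper
    derives them is not stated in this record.
    """
    if not points:
        raise ValueError("cluster must contain at least one point")
    total = sum(point_to_segment_distance(p, a, b) for p in points)
    return total / len(points)


if __name__ == "__main__":
    cluster = [(0.0, 1.0), (2.0, 1.0), (-1.0, 0.0), (1.0, 0.0)]
    print(segment_dispersion(cluster, (0.0, 0.0), (2.0, 0.0)))
```

Clamping the projection parameter is what distinguishes a segment from an infinite line: points beyond an endpoint are measured to that endpoint rather than to the line's extension.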