On the Consistency of k-means++ algorithm
Author: | Mieczyslaw A. Klopotek |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Algebra and Number Theory; Population; k-means clustering; 0102 computer and information sciences; Expected value; 01 natural sciences; Machine Learning (cs.LG); Theoretical Computer Science; Constant factor; Computer Science - Learning; Computational Theory and Mathematics; 010201 computation theory & mathematics; Sample size determination; Consistency (statistics); Applied mathematics; Cluster analysis; Information Systems; Mathematics |
Source: | Fundamenta Informaticae. 172:361-377 |
ISSN: | 1875-8681 0169-2968 |
Description: | We prove in this paper that the expected value of the objective function of the $k$-means++ algorithm computed on samples converges to the population expected value. Since $k$-means++ provides a constant-factor approximation of the $k$-means objective for samples, such an approximation can also be achieved for the population as the sample size increases. This result is of potential practical relevance when one considers subsampling for clustering large data sets (large databases). |
Database: | OpenAIRE |
External link: |
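
The subsampling scenario mentioned in the description can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a finite synthetic data set standing in for the population, uses only the k-means++ seeding step (D² sampling, without subsequent Lloyd iterations), and all function names (`kmeanspp_seed`, `mean_cost`, `subsample_experiment`, `make_population`) are hypothetical. It seeds on random subsamples of different sizes and averages the resulting k-means cost measured on the full data set.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng):
    """k-means++ seeding only (D^2 sampling); returns k centers drawn from points."""
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to a uniform draw.
            centers.append(rng.choice(points))
        else:
            # Draw the next center with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centers.append(points[idx])
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers


def mean_cost(points, centers):
    """k-means objective per point: mean squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def subsample_experiment(population, k, sample_sizes, trials, seed=0):
    """For each sample size, average the population cost of centers seeded on samples."""
    rng = random.Random(seed)
    results = {}
    for n in sample_sizes:
        costs = []
        for _ in range(trials):
            sample = rng.sample(population, n)
            centers = kmeanspp_seed(sample, k, rng)
            costs.append(mean_cost(population, centers))
        results[n] = sum(costs) / trials
    return results


def make_population(size, seed=1):
    """Assumed toy data: three unit-variance Gaussian blobs in the plane."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    return [tuple(rng.gauss(m, 1.0) for m in rng.choice(means)) for _ in range(size)]


if __name__ == "__main__":
    population = make_population(2000)
    for n, cost in subsample_experiment(population, 3, [20, 200, 2000], trials=20).items():
        print(n, round(cost, 3))
```

Comparing the averaged costs across sample sizes gives an informal, empirical view of how seeding on a subsample relates to seeding on the full data set; it is a numerical illustration under the stated assumptions, not evidence for the theorem.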