A Framework for Parallelizing Hierarchical Clustering Methods

Authors: Benjamin Moseley, Kefu Lu, Thomas Lavastida, Silvio Lattanzi
Publication year: 2020
Subject:
Source: Machine Learning and Knowledge Discovery in Databases ISBN: 9783030461492
ECML/PKDD (1)
DOI: 10.1007/978-3-030-46150-8_5
Description: Hierarchical clustering is a fundamental tool in data mining, machine learning and statistics. Popular hierarchical clustering algorithms include top-down divisive approaches such as bisecting k-means, k-median, and k-center and bottom-up agglomerative approaches such as single-linkage, average-linkage, and centroid-linkage. Unfortunately, only a few scalable hierarchical clustering algorithms are known, mostly based on the single-linkage algorithm. So, as datasets increase in size every day, there is a pressing need to scale other popular methods.
Database: OpenAIRE