Generating Optimum Number of Clusters Using Median Search and Projection Algorithms

Authors: Rajappa Veluru, L Suresh, Jay B. Simha
Year of publication: 2010
Subject:
Source: AINA Workshops
DOI: 10.1109/waina.2010.196
Description: K-means clustering is an important algorithm for identifying structure in data, and one of the simplest clustering algorithms. It takes a predefined number of clusters as input. The mean is the average location of all members of a particular cluster. The algorithm starts from randomly selected cluster centers and iteratively improves the result. In this work, a novel approach to seeding the clusters with the latent data structure is proposed. This is expected to minimize (1) the need to specify the number of clusters a priori and (2) the time to convergence, by providing near-optimal cluster centers. The algorithms are also tested on column-store databases, a recent standard for data warehouses.
Database: OpenAIRE
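
Illustrative note: the description above outlines standard k-means (a fixed number of clusters, centers placed at cluster means, iterative refinement from an initial choice of centers). The sketch below shows only that baseline loop, with an optional argument for supplying precomputed seeds. It is not the authors' median search or projection method, which this record does not describe. The function name `kmeans`, the `init_centers` parameter, and the sample points are all hypothetical, chosen for illustration.

```python
import random


def kmeans(points, k, init_centers=None, max_iter=100, seed=0):
    """Plain Lloyd-style k-means on a list of equal-length numeric tuples.

    init_centers is a hypothetical hook: pass precomputed seeds to skip
    the random initialization. Assumes max_iter >= 1.
    Returns (centers, labels, iterations_run).
    """
    rng = random.Random(seed)
    # Random seeding is the baseline; supplied seeds replace it.
    centers = [tuple(c) for c in (init_centers or rng.sample(points, k))]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments did not change: converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels, iteration


# Toy data (made up): two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Seeds placed near the true groups, standing in for "near-optimal" centers.
centers, labels, n_iter = kmeans(points, k=2,
                                 init_centers=[(1.0, 1.0), (5.0, 5.0)])
```

With seeds already close to the two groups, the assignments settle on the first pass and the second pass only confirms them. Random seeds may need more passes, which is the convergence cost the description says good seeding is expected to reduce.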