Distributed Hierarchal Clustering Algorithm Utilizing a Distance Matrix

Authors: Simon Dexter, Gavriel Yarmish, Philip Listowsky
Publication year: 2017
Subject:
Source: 2017 International Conference on Computational Science and Computational Intelligence (CSCI).
DOI: 10.1109/csci.2017.282
Description: Grouping similar objects into a small number of clusters is important in many applications, including search engines, monitoring of academic performance, biology, and wireless networks. We first discuss a number of clustering methods. We then present a parallel algorithm for the efficient clustering of proteins into groups. The input is an n by n distance matrix, built in a first step whose details depend on the application. For two simple points in space, for example, the matrix entry can be their Euclidean distance. As another example, Root-Mean-Square Deviation (RMSD) values can be computed for any two 3-D structures and used as the distance between them. The second step utilizes parallel processors to compute a hierarchical clustering of these n items based on this matrix. We have implemented our algorithm and have found it to be scalable.
Database: OpenAIRE
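
The two steps named in the description (build a distance matrix, then cluster hierarchically with parallel workers) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not state the linkage rule or how work is divided, so single linkage, round-robin row partitioning across threads, and all function names below are assumptions made only for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def euclidean_distance_matrix(points):
    """Build the symmetric n-by-n matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


def _closest_in_rows(dist, active, rows):
    """Return (distance, i, j) for the closest active pair with i in `rows`."""
    best = (math.inf, -1, -1)
    for i in rows:
        for j in active:
            if i < j and dist[i][j] < best[0]:
                best = (dist[i][j], i, j)
    return best


def hierarchical_cluster(dist, workers=2):
    """Single-linkage agglomerative clustering on a distance matrix.

    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    where each cluster is a sorted tuple of original item indices.
    Linkage rule and partitioning are assumptions, not taken from the paper.
    """
    dist = [row[:] for row in dist]  # work on a copy of the input
    members = {i: (i,) for i in range(len(dist))}
    merges = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while len(members) > 1:
            active = sorted(members)
            # Assumed partitioning: deal the active rows round-robin to workers.
            chunks = [active[w::workers] for w in range(workers)]
            # Each worker finds its local minimum; min() reduces to the global one.
            local = pool.map(
                lambda rows: _closest_in_rows(dist, active, rows), chunks
            )
            d, i, j = min(local)
            merges.append((members[i], members[j], d))
            # Single linkage: distance from the merged cluster to k is the
            # smaller of the two old distances. Row/column i now stands for it.
            for k in active:
                if k != i and k != j:
                    dist[i][k] = dist[k][i] = min(dist[i][k], dist[j][k])
            members[i] = tuple(sorted(members[i] + members[j]))
            del members[j]
    return merges
```

Calling `hierarchical_cluster(euclidean_distance_matrix(points))` on a list of coordinate tuples returns the merge order from which a dendrogram can be drawn. For protein structures, the same function could be fed a precomputed RMSD matrix in place of the Euclidean one.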