PGCAS: A Parallelized Graph Clustering Algorithm Based on Spark
Authors: | Jianhui Li, Dongjiang Liu |
---|---|
Publication year: | 2019 |
Source: | Big Scientific Data Management ISBN: 9783030280604 BigSDM |
DOI: | 10.1007/978-3-030-28061-1_20 |
Description: | Nowadays, much data is in graph format. For example, knowledge graphs use vertices to represent entities and edges to represent relations between entities; graph data in microbiology contain microorganisms and the relations between them. Information can therefore be obtained from these data by graph mining. Graph clustering is a part of graph mining. In recent years, many graph clustering algorithms have been proposed, but most of them are sequential algorithms and so cannot run in a distributed environment. As a result, the volume of data that these algorithms can process is limited. In this paper, we propose a new parallelized graph clustering algorithm based on Spark. Several methods have been adopted in the algorithm to improve its running speed. The experimental results show that the proposed algorithm performs better than the parallelized graph clustering algorithm used for comparison. |
Database: | OpenAIRE |