Bisimulation reduction of big graphs on MapReduce

Autor: Luo, Y., Lange, de, Y., Fletcher, G.H.L., De Bra, P.M.E., Hidders, A.J.H., Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C.
Přispěvatelé: Mathematics and Computer Science, Database Group
Jazyk: angličtina
Rok vydání: 2013
Předmět:
Zdroj: Big Data (29th British National Conference on Databases, BNCOD 2013, Oxford, UK, July 8-10, 2013. Proceedings), 189-203
STARTPAGE=189;ENDPAGE=203;TITLE=Big Data (29th British National Conference on Databases, BNCOD 2013, Oxford, UK, July 8-10, 2013. Proceedings)
Big Data ISBN: 9783642394669
BNCOD
Popis: Computing the bisimulation partition of a graph is a fundamental problem which plays a key role in a wide range of basic applications. Intuitively, two nodes in a graph are bisimilar if they share basic structural properties such as labeling and neighborhood topology. In data management, reducing a graph under bisimulation equivalence is a crucial step, e.g., for indexing the graph for efficient query processing. Often, graphs of interest in the real world are massive; examples include social networks and linked open data. For analytics on such graphs, it is becoming increasingly infeasible to rely on in-memory or even I/O-efficient solutions. Hence, a trend in Big Data analytics is the use of distributed computing frameworks such as MapReduce. While there are both internal and external memory solutions for efficiently computing bisimulation, there is, to our knowledge, no effective MapReduce-based solution for bisimulation. Motivated by these observations we propose in this paper the first efficient MapReduce-based algorithm for computing the bisimulation partition of massive graphs. We also detail several optimizations for handling the data skew which often arises in real-world graphs. The results of an extensive empirical study are presented which demonstrate the effectiveness and scalability of our solution.
Databáze: OpenAIRE