Generate-map-reduce: An extension to map-reduce to support shared data and recursive computations

Autor: Geeta Iyer, Sriram Kailasam, Janakiram Dharanipragada
Rok vydání: 2013
Předmět:
Zdroj: Concurrency and Computation: Practice and Experience. 26:561-585
ISSN: 1532-0626
Popis: SUMMARY It is difficult to express the parallelism present in complex computations by using existing higher level abstractions such as MapReduce and Dryad. These computations include applications from wide variety of domains, like Artificial Intelligence, Decision Tree Algorithms, Association Rule Mining, Recommender Systems, Graph Algorithms, Clustering Algorithms, Compute Intensive Scientific Workflows, Optimization Algorithms, and so forth. Their execution graphs introduce new challenges in terms of programmer expressibility and runtime performance such as iterative and recursive computations, shared communication model, and so forth. We propose an extension to MapReduce, called Generate-Map-Reduce (GMR), targeted towards modeling these applications. GMR introduces a new Generate abstraction into the MapReduce framework that captures recursive computations. The runtime also supports iterative jobs and a distributed communication model by using shared data structures. We illustrate recursive computations with GMR by modeling complex applications such as simulated annealing, A* search, and adaptive quadrature computation that require recursive spawning of new tasks to handle variable degree of parallelism. GMR runtime supports caching of common data across iterations in memory and local disks. We illustrate how this caching helps in achieving significant speedup for iterative computations by modeling k-means clustering. Copyright © 2013 John Wiley & Sons, Ltd.
Databáze: OpenAIRE