Application Profiling in Hierarchical Hadoop for Geo-distributed Computing Environments

Author: Orazio Tomarchio, Carmelo Polito, Marco Cavallo, Giuseppe Di Modica
Contributors: Cavallo Marco, Di Modica Giuseppe, Polito Carmelo, Tomarchio O
Language: English
Year of publication: 2016
Subject:
Source: ISCC
Description: Over the past two decades there has been growing interest in defining new distributed computational paradigms capable of manipulating and analyzing huge amounts of data. Among these, MapReduce stands out for its popularity. Its open-source implementation, Hadoop, is widely used in academic environments and is also strongly supported by major IT players. In many application scenarios, the data to be manipulated resides in data centers that are heterogeneous in terms of computing capacity and geographically distant from one another. Unfortunately, in such contexts Hadoop performs very poorly. In this paper we propose leveraging a hierarchical computing framework to boost Hadoop performance in geo-distributed computing environments. The proposed framework gathers up-to-date information from the distributed computing context and exploits it to carry out a smart job scheduling strategy. This work focuses on the study and definition of the application profile of the jobs. We implemented a software prototype of the proposed hierarchical Hadoop framework. Tests run on the prototype demonstrated the capability of the job scheduling system to compute a job's execution path and estimate its completion time.
Database: OpenAIRE