Application Profiling in Hierarchical Hadoop for Geo-distributed Computing Environments
Author: | Orazio Tomarchio, Carmelo Polito, Marco Cavallo, Giuseppe Di Modica |
---|---|
Contributors: | Cavallo Marco, Di Modica Giuseppe, Polito Carmelo, Tomarchio O |
Language: | English |
Publication year: | 2016 |
Subject: | Job scheduler; Database; Distributed database; Computer science; Distributed computing; Big data; Processor scheduling; Software engineering; Context (language use); Engineering and technology; Computer hardware & architecture; Electrical engineering, electronic engineering, information engineering; Data-intensive computing; Profiling (information science) |
Source: | ISCC |
Description: | Over the past two decades there has been growing interest in defining new distributed computational paradigms capable of serving the need to manipulate and analyze huge amounts of data. Among these, MapReduce stands out for its popularity. Its open-source implementation, Hadoop, is widely used in academic environments and is also strongly supported by major IT players. There are many application scenarios in which the data to be manipulated resides in data centers that are heterogeneous in terms of computing capacity and geographically distant from each other. Unfortunately, in these contexts Hadoop performs very poorly. In this paper we propose leveraging a hierarchical computing framework to boost Hadoop performance in geo-distributed computing environments. The proposed framework gathers fresh information from the distributed computing context and exploits it to carry out a smart job scheduling strategy. This work focuses on the study and definition of the application profile of the jobs. We implemented a software prototype of the proposed hierarchical Hadoop framework. Tests run on the prototype showed that the job scheduling system is able to compute a job's execution path and estimate its completion time. |
Database: | OpenAIRE |
External link: |