A Smart Intermediate Data Scheduling Policy Based on Computation Resources Sharing to Alleviate Negative Performance Impact of Intermediate Data Skew in Small-Scale MapReduce Cloud

Author: Lin, Jia-Huei, 林家輝
Publication year: 2018
Document type: thesis
Description: 106
A MapReduce cloud has become basic infrastructure for cloud computing today. Because applications process varied input data in different ways to produce intermediate data, a MapReduce cloud may suffer from intermediate data skew, which stems from intermediate data being distributed unevenly among computers at run time. When intermediate data skew occurs, a MapReduce cloud not only leaves computers idle, wasting computation resources, but also prolongs application execution time. In contrast to existing solutions, which assume that many idle computers are available and use computation resources loosely, this thesis proposes the Smart Intermediate Data Scheduling Policy (SIDSP) to alleviate the negative performance impact of intermediate data skew in a small-scale MapReduce cloud. We also test SIDSP with popular applications and compare it with two systems, both when intermediate data is evenly distributed and when it is seriously skewed.
Database: Networked Digital Library of Theses & Dissertations