Description: |
Choosing the right configuration for Spark deployed in the public cloud so that periodic jobs run efficiently is hard, because the configuration space is huge and comprises numerous performance-related parameters across different dimensions (e.g., application-level and cloud-level). A poor choice not only significantly degrades performance but may also lead to greater overhead. Yet automatically searching for the optimal configuration that trades off performance and cost across various applications is challenging. To address this issue, we propose a new optimal configuration search algorithm named AB-MOEA/D, which combines a multi-objective optimization algorithm with a performance prediction model. AB-MOEA/D uses a decomposition-based multi-objective optimization algorithm to find the configuration that minimizes execution time and cost, where a performance model built with the AdaBoost algorithm evaluates the fitness of each candidate configuration. We also present an automatic configuration tuning system with AB-MOEA/D as its optimization engine. Experimental results on six benchmarks with five data sets show that AB-MOEA/D significantly outperforms previous work in terms of execution time and cost, with average improvements of approximately 35 and 40 percent, respectively. |