Selecting resources for distributed dataflow systems according to runtime targets

Authors:	Ilya Verbitskiy, Thomas Renner, Florian Schmidt, Odej Kao, Lauritz Thamsen
Year of publication:	2016
Subject:	Dataflow; Computer science; Model selection; Distributed computing; 02 engineering and technology; Yarn; Data modeling; Set (abstract data type); 020204 information systems; Spark (mathematics); 0202 electrical engineering, electronic engineering, information engineering; Data analysis; 020201 artificial intelligence & image processing; Resource management (computing)
Source:	IPCCC
DOI:	10.1109/pccc.2016.7820629
Description:	Distributed dataflow systems like Spark or Flink enable users to analyze large datasets. Users create programs by providing sequential user-defined functions for a set of well-defined operations and select a set of resources; the systems then automatically distribute the jobs across these resources. However, selecting resources for specific performance needs is inherently difficult, and users consequently tend to overprovision, which results in poor cluster utilization. At the same time, many important jobs are executed repeatedly in production clusters. This paper presents Bell, a practical system that monitors job execution, models the scale-out behavior of jobs based on previous runs, and selects resources according to user-provided runtime targets. Bell automatically chooses between different runtime prediction models to optimally support different distributed dataflow systems. Bell is implemented as a job submission tool for YARN and thus works with existing cluster setups. We evaluated Bell's runtime prediction with six exemplary data analytics jobs using both Spark and Flink. We present the learned scale-out models for these jobs and evaluate the relative prediction error using cross-validation, showing that our model selection approach provides better overall performance than the individual prediction models.
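Illustration:	The workflow the description outlines (model scale-out behavior from previous runs, choose between prediction models by cross-validated relative error, then pick resources for a runtime target) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Bell's implementation: the record does not say which prediction models Bell uses, so the two models here (an inverse-scaling fit and a nearest-neighbour average), all function names, and the example data are hypothetical.

```python
from statistics import mean

# A "run" is a (scale_out, runtime_seconds) pair from a previous execution.
# Both models below are illustrative assumptions, not the models from the paper.

def fit_inverse(runs):
    """Least-squares fit of runtime = a + b / n (hypothetical parametric model)."""
    xs = [1.0 / n for n, _ in runs]
    ys = [t for _, t in runs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda n: a + b / n

def fit_nearest(runs, k=2):
    """Average runtime of the k runs closest in scale-out (hypothetical nonparametric model)."""
    def predict(n):
        nearest = sorted(runs, key=lambda r: abs(r[0] - n))[:k]
        return mean(t for _, t in nearest)
    return predict

def loo_error(fit, runs):
    """Mean relative prediction error under leave-one-out cross-validation."""
    errors = []
    for i, (n, t) in enumerate(runs):
        model = fit(runs[:i] + runs[i + 1:])  # train without run i
        errors.append(abs(model(n) - t) / t)  # test on run i
    return mean(errors)

def select_scale_out(runs, target, candidates):
    """Pick the model with the lowest cross-validated error, then return its name
    and the smallest candidate scale-out predicted to meet the runtime target
    (None if no candidate is predicted to meet it)."""
    fits = {"inverse": fit_inverse, "nearest": fit_nearest}
    best = min(fits, key=lambda name: loo_error(fits[name], runs))
    model = fits[best](runs)
    for n in sorted(candidates):
        if model(n) <= target:
            return best, n
    return best, None
```

Calling `select_scale_out(previous_runs, target_seconds, range(2, 17))` with a handful of recorded runs returns the chosen model's name and a suggested scale-out. Returning the smallest qualifying scale-out reflects the overprovisioning concern raised in the description.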
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::7062cc58d4b1899c592db5940d572b3d https://doi.org/10.1109/pccc.2016.7820629