Big Data Workflows: Locality-Aware Orchestration Using Software Containers

Authors: Corodescu, Andrei-Alin, Nikolov, Nikolay, Khan, Akif Quddus, Soylu, Ahmet, Matskin, Mihhail, Payberah, Amir H., Roman, Dumitru
Language: English
Year of publication: 2021
Subject:
Source: Sensors (Basel, Switzerland), Vol. 21, Issue 24, Article 8212 (2021)
ISSN: 1424-8220
1010-1683
Description: The emergence of the Edge computing paradigm has shifted data processing from centralised infrastructures to heterogeneous and geographically distributed infrastructures. Therefore, data processing solutions must consider data locality to reduce the performance penalties from data transfers among remote data centres. Existing Big Data processing solutions provide limited support for handling data locality and are inefficient in processing small and frequent events specific to the Edge environments. This article proposes a novel architecture and a proof-of-concept implementation for software container-centric Big Data workflow orchestration that puts data locality at the forefront. The proposed solution considers the available data locality information, leverages long-lived containers to execute workflow steps, and handles the interaction with different data sources through containers. We compare the proposed solution with Argo Workflows and demonstrate a significant performance improvement in the execution speed for processing the same data units. Finally, we carry out experiments with the proposed solution under different configurations and analyse individual aspects affecting the performance of the overall solution. The work in this paper was partly funded by the EC H2020 project “DataCloud” (grant number 101016835) and the NFR project “BigDataMine” (grant number 309691).
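The locality-aware placement idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the `Worker` type, the `pick_worker` function, the site labels, and the transfer-cost table are all hypothetical names introduced here. The sketch assumes each long-lived worker container is tagged with the site it runs on, and that a workflow step should go to an idle worker at the site holding the data, falling back to the site with the lowest assumed transfer cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    """A long-lived worker container (hypothetical model, not from the paper)."""
    name: str
    site: str           # assumed label for the data centre or edge site
    busy: bool = False  # True while the worker is executing a step


def pick_worker(workers, data_site, transfer_cost):
    """Return the idle worker with the cheapest access to data_site.

    transfer_cost maps (data_site, worker_site) pairs to an assumed cost.
    A worker at the same site as the data costs 0; a pair missing from
    the table is treated as infinitely expensive.
    """
    idle = [w for w in workers if not w.busy]
    if not idle:
        return None  # no capacity: the caller would have to queue the step

    def cost(worker):
        if worker.site == data_site:
            return 0.0
        return transfer_cost.get((data_site, worker.site), float("inf"))

    return min(idle, key=cost)
```

For example, with one idle worker at `cloud` and one at `edge-a`, a step whose data lives at `edge-a` is placed on the `edge-a` worker, so no data crosses sites.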
Database: OpenAIRE