Bi-Objective CSO for Big Data Scientific Workflows Scheduling in the Cloud: Case of LIGO Workflow

Authors: Khadija Bousselmi, Marta Rukoz, S. Ben Hamida
Contributors: Université Savoie Mont Blanc (USMB [Université de Savoie] [Université de Chambéry]), Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL), Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision (LAMSADE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Université Paris Nanterre - UFR Sciences économiques, gestion, mathématiques, informatique (UPN SEGMI), Université Paris Nanterre (UPN)
Language: English
Year of publication: 2020
Subject:
Source: 15th International Conference on Software Technologies
15th International Conference on Software Technologies, Jul 2020, Lieusaint-Paris, France. pp.615-624, ⟨10.5220/0009827106150624⟩
Proceedings of the 15th International Conference on Software Technologies (ICSOFT 2020)
ICSOFT
DOI: 10.5220/0009827106150624
Description: Scientific workflows are used to model scalable, portable, and reproducible big data analyses and scientific experiments with low development costs. To optimize their performance and ensure efficient use of data resources, scientific workflows that handle large volumes of data need to be executed on scalable distributed environments such as Cloud infrastructure services. The problem of scheduling such workflows is known to be NP-complete. It aims to find an optimal mapping of tasks to computing resources and of data to storage resources that meets the end user's quality of service objectives, especially minimizing the overall makespan or the financial cost of the workflow. In this paper, we formulate the problem of scheduling big data scientific workflows as a bi-objective optimization problem that aims to minimize both the makespan and the cost of the workflow. The formulated problem is then solved using our proposed Bi-Objective Cat Swarm Optimization algorithm (BiO-CSO), an extension of the bio-inspired CSO algorithm. The extension consists of adapting the algorithm to solve multi-objective discrete optimization problems. Our application case is the LIGO Inspiral workflow, a CPU- and data-intensive workflow used to generate and analyze gravitational waveforms from data collected during the coalescing of compact binary systems. The performance of the proposed method is then compared to that of the multi-objective Particle Swarm Optimization (PSO) algorithm, which has proven effective for scheduling scientific workflows. The experimental results show that our BiO-CSO algorithm performs better than the multi-objective PSO, since it provides more and better final scheduling solutions.
Database: OpenAIRE
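
Note: The description above frames scheduling as minimizing makespan and cost together. In a bi-objective setting of this kind, candidate schedules are commonly compared by Pareto dominance. The following is a minimal sketch of that comparison only, not the BiO-CSO algorithm from the paper; the function names and the sample (makespan, cost) values are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# One candidate schedule summarized as (makespan, cost); both are minimized.
# The numeric values used below are made up for illustration.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 8.0), (10.0, 6.0)]
front = pareto_front(candidates)
print(front)
```

Here (9.0, 8.0) and (10.0, 6.0) are dropped because another candidate is at least as good on both objectives. The remaining schedules are trade-offs between makespan and cost, and a multi-objective optimizer would typically return such a set rather than a single schedule.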