DIRAC Site Director: Improving Pilot-Job provisioning on grid resources
Authors: | Alexandre F. Boyer, Christophe Haen, Federico Stagni, David R.C. Hill |
---|---|
Contributors: | Institut Supérieur d'Informatique, de Modélisation et de leurs Applications (ISIMA), Université Clermont Auvergne [2017-2020] (UCA [2017-2020]), Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-Centre National de la Recherche Scientifique (CNRS) |
Language: | English |
Publication year: | 2022 |
Subject: | |
Source: | Future Generation Computer Systems, 2022, 133, pp. 23-38. ⟨10.1016/j.future.2022.03.002⟩ |
ISSN: | 0167-739X |
DOI: | 10.1016/j.future.2022.03.002 |
Description: | To study the constituents of matter, CERN mainly relies on the Worldwide LHC Computing Grid (WLCG), which processes petabytes of data coming from the Large Hadron Collider (LHC). LHC experiments have adopted the Pilot-Job paradigm and provide tools that supply grid resources with Pilot-Jobs in order to efficiently leverage the computing power offered by WLCG. This approach alone will not suffice and will need to be complemented to meet the future computing needs of the High-Luminosity LHC and the growth in data generated over time: national science programs are consolidating computing resources and encouraging the use of cloud and High-Performance Computing systems. Yet, even though they have started to integrate their workflows on such infrastructures, LHC experiments still largely depend on WLCG resources. This paper lays out an approach to increase job throughput on grid resources by improving the performance of the Pilot-Job provisioning tools, through a case study: the LHCb-specific solution known as "DIRAC Site Director". We propose: (i) a complete analysis of the capabilities and limitations of the DIRAC Site Director; (ii) several methods to speed up its execution, including parallel processing as well as bulk operations; (iii) a comprehensive analysis of a group of Site Directors in the LHCb production environment over 12 months. With our approach, we recorded a 40.86% increase in the number of jobs processed simultaneously per second, enabling the simultaneous management of 80,300 LHCb jobs, compared with only 57,000 before our improvements. |
Database: | OpenAIRE |
External link: |