Digital twins, the journey of an operational weather system into the heart of Destination Earth.

Authors: Geenen, Thomas, Wedi, Nils, Milinski, Sebastian, Hadade, Ioan, Reuter, Balthasar, Smart, Simon, Hawkes, James, Kuwertz, Emma, Quintino, Tiago, Danovaro, Emanuele, Sarmany, Domokos, Aguridan, Razvan, Maciel, Pedro, Suttie, Martin, Duma, Cristina, Griffith, Matthew, Burton, Paul, Bennet, Andrew, Horvjar, Tryggvi, Hernandez, Bentorey
Subject:
Source: Procedia Computer Science; 2024, Vol. 240, p99-108, 10p
Abstract: Moving a world-leading numerical weather prediction system that runs on a dedicated, bespoke high-performance computing cluster and supporting infrastructure into the heart of a digital twin for climate change adaptation and extreme weather events has been a challenging and exciting journey. In this paper we describe this journey, focusing on the aspects required to leverage the pre-exascale EuroHPC systems that have been made available to Destination Earth (DestinE) to run its computational representation. EuroHPC systems can be used effectively for DestinE and are in fact key assets for delivering the computational power required for Earth system digital twins at global km-scale resolution. At the same time, the EuroHPC systems were newly installed, and procedures to run them efficiently are still evolving. We find that each of these systems is operated by a national hosting entity that implements its own procedures, e.g. for identity and access management; its own system configuration, such as schedulers, filesystems, and software management systems; and its own, sometimes vendor-associated, toolchains, tooling, and container runtimes. In particular, the different scheduling policies we encountered required us to adapt our workflows for each site. We found that having dedicated resources available, which was trialed from 16th February to 14th April on LUMI, allowed us to achieve high occupation rates, with 92% on the reserved GPU allocation and greater than 97% efficiency on the CPU reservation. A stronger focus on federation of these systems is also required, covering not only identities and accounts but also data ownership and transfer, observability, services and service accounts, maintenance coordination, and performance portability. In general, it should become much easier to transfer a workload, or a digital twin system, from one EuroHPC site to the next, and to run and maintain it across several sites concurrently.
[ABSTRACT FROM AUTHOR]
Database: Supplemental Index