Automatic load-balance method for coupled Earth System Models
Author: | Palomas Martinez, Sergi |
---|---|
Contributors: | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Acosta Cobos, Mario César, Tourigny, Etienne, Álvarez Martínez, Carlos |
Language: | English |
Publication year: | 2022 |
Subject: |
load-balance
rendiment Informàtica::Enginyeria del software [Àrees temàtiques de la UPC] distribució de càrrega Climatologia--Simulació per ordinador ESMs Climatology--Computer simulation HPC MPI High performance computing models de ciències de la terra Càlcul intensiu (Informàtica) performance computació d'altes prestacions |
Description: | Earth System Models (ESMs) are complex models used to simulate the Earth's climate. They are commonly built from independent components, each simulating a specific natural phenomenon (ocean dynamics, atmospheric dynamics, atmospheric chemistry, land and ocean biosphere, etc.). To simulate the interactions between these processes, ESMs use coupling libraries that manage the synchronization and field exchanges between the components, which run in parallel as a typical Multiple Program, Multiple Data (MPMD) application. The performance achieved depends on the coupling approach and on the number of parallel resources and the scalability properties of each component. Finding the best number of resources for each component of a coupled ESM is crucial to using parallel resources efficiently. However, this is still done by manually testing multiple process allocations through trial and error, which leads to sub-optimal configurations because the dependencies between the constituents are complex and the models do not scale perfectly. This project presents a methodology to find the number of resources to allocate to each component that achieves the best computational performance for the coupled ESM, minimizing both the cost of executing each constituent (which may not run at its individually optimal configuration) and the waiting time caused by synchronization between them. To achieve this, a number of novel metrics were designed and implemented to identify the component(s) acting as bottleneck(s) and to evaluate the performance of the coupled execution according to different Energy-To-Solution (ETS) / Time-To-Solution (TTS) trade-off criteria. The methodology has been tested against multiple resource configurations of EC-Earth3, an ESM widely used in Europe. The results show that some configurations could run up to 34% faster and reduce the execution cost by 6.7%.
Moreover, the method has been compared against a configuration used for the Coupled Model Intercomparison Project Phase 6 (CMIP6) and achieved a set-up that is 5% faster and 1% less costly. Lastly, the work has been integrated into a workflow manager to automate the tasks, requiring minimal user intervention. |
Database: | OpenAIRE |
External link: |
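
The load-balance problem described in the abstract can be illustrated with a minimal sketch. This is not the method or the metrics from the thesis: it assumes a toy scaling model in which each component takes `serial + parallel / procs` seconds per coupling step, treats the coupled step time as the slowest component (a stand-in for TTS), and uses core-seconds as a rough proxy for cost. The function names, component names, and parameter values are all hypothetical.

```python
from itertools import product


def component_time(serial, parallel, procs):
    """Toy scaling model (an assumption): fixed serial part plus a perfectly parallel part."""
    return serial + parallel / procs


def evaluate(allocation, models):
    """Return (coupled_time, cost, waiting) for one process allocation.

    The components run concurrently and synchronize, so the coupled step
    is as slow as the slowest component; every other component waits.
    """
    times = {name: component_time(*models[name], procs)
             for name, procs in allocation.items()}
    coupled = max(times.values())
    cost = coupled * sum(allocation.values())  # core-seconds, a proxy for cost
    waiting = {name: coupled - t for name, t in times.items()}
    return coupled, cost, waiting


def best_allocation(models, total_procs):
    """Exhaustively search all splits of total_procs and keep the fastest one."""
    names = list(models)
    best = None
    for procs in product(range(1, total_procs + 1), repeat=len(names)):
        if sum(procs) != total_procs:
            continue
        allocation = dict(zip(names, procs))
        coupled, cost, _ = evaluate(allocation, models)
        if best is None or coupled < best[0]:
            best = (coupled, cost, allocation)
    return best


# Hypothetical (serial, parallel) seconds per component.
models = {"ocean": (10, 1000), "atmosphere": (20, 3000)}

coupled, cost, waiting = evaluate({"ocean": 4, "atmosphere": 4}, models)
print("even split:", coupled, cost, waiting)
print("best split:", best_allocation(models, 8))
```

With an even split, the component with zero waiting time is the bottleneck; the search shifts processes toward it until the component times are nearly balanced. An exhaustive search only scales to a few components and small process counts, so a real ESM would need measured scaling curves and a smarter search.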