An approach for realistically simulating the performance of scientific applications on high performance computing systems
Autor: | Franziska Kasielke, Ioana Banicescu, Ali Mohammed, Florina M. Ciorba, Ahmed Eleliemy |
---|---|
Rok vydání: | 2020 |
Předmět: |
FOS: Computer and information sciences
Computer Science - Performance Computer Networks and Communications Computer science 020206 networking & telecommunications 02 engineering and technology Load balancing (computing) Supercomputer Exascale computing Scheduling (computing) Performance (cs.PF) Computer Science - Distributed Parallel and Cluster Computing Computer engineering Hardware and Architecture 0202 electrical engineering electronic engineering information engineering 020201 artificial intelligence & image processing Distributed Parallel and Cluster Computing (cs.DC) Software |
Zdroj: | Future Generation Computer Systems. 111:617-633 |
ISSN: | 0167-739X |
DOI: | 10.1016/j.future.2019.10.007 |
Popis: | Scientific applications often contain large, computationally-intensive, and irregular parallel loops or tasks that exhibit stochastic behavior leading to load imbalance. Load imbalance often manifests during the execution of parallel scientific applications on large and complex high performance computing (HPC) systems. The extreme scale of HPC systems on the road to Exascale computing only exacerbates the loss in performance due to load imbalance. Dynamic loop self-scheduling (DLS) techniques are instrumental in improving the performance of scientific applications on HPC systems via load balancing. Selecting a DLS technique that results in the best performance for different problem and system sizes requires a large number of exploratory experiments. Currently, a theoretical model that can be used to predict the scheduling technique that yields the best performance for a given problem and system has not yet been identified. Therefore, simulation is the most appropriate approach for conducting such exploratory experiments in a reasonable amount of time. However, conducting realistic and trustworthy simulations of application performance under different configurations is challenging. This work devises an approach to realistically simulate computationally-intensive scientific applications that employ DLS and execute on HPC systems. The proposed approach minimizes the sources of uncertainty in the simulative experiments results by bridging the native and simulative experimental approaches. A new method is proposed to capture the variation of application performance between different native executions. Several approaches to represent the application tasks (or loop iterations) are compared to establish their influence on the simulative application performance. A novel simulation strategy is introduced that applies the proposed approach, which transforms a native application code into simulative code. The native and simulative performance of two computationally-intensive scientific applications that employ eight task scheduling techniques (static, nonadaptive dynamic, and adaptive dynamic) are compared to evaluate the realism of the proposed simulation approach. The comparison of the performance characteristics extracted from the native and simulative performance shows that the proposed simulation approach fully captured most of the performance characteristics of interest. This work shows and establishes the importance of simulations that realistically predict the performance of DLS techniques for different applications and system configurations. |
Databáze: | OpenAIRE |
Externí odkaz: |