A benchmark-based performance model for memory-bound HPC applications

Authors:	Brice Goglin, Denis Barthou, Bertrand Putigny
Contributors:	Efficient runtime systems for parallel architectures (RUNTIME), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS), Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
Publication year:	2014
Subjects:	Cache coloring; Computer science; multicore; Cache-only memory architecture; Parallel computing; Cache pollution; Cache-oblivious algorithm; timing prediction; Non-uniform memory access; caches; [INFO.INFO-OS]Computer Science [cs]/Operating Systems [cs.OS]; Cache; memory model; micro-benchmarks; Cache algorithms; Cache coherence
Source:	International Conference on High Performance Computing & Simulation (HPCS 2014), Jul 2014, Bologna, Italy. ⟨10.1109/HPCSim.2014.6903790⟩
DOI:	10.1109/hpcsim.2014.6903790
Description:	The increasing computation capability of servers comes with a dramatic increase in their complexity, through many cores, multiple levels of caches, and NUMA architectures. Exploiting this computing power is increasingly hard, and programmers need ways to understand performance behavior. We present an innovative approach for predicting the performance of memory-bound multi-threaded applications. It relies on micro-benchmarks and a compositional model that combines micro-benchmark measurements in order to model larger codes. Our memory model takes into account cache sizes and cache coherence protocols, which have a large impact on the performance of multi-threaded codes. Applying this model to real-world HPC kernels shows that it can predict their performance with good accuracy, helping programmers make optimization decisions that increase application performance.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::348274643a20ef253f9b1e3037181c4d https://doi.org/10.1109/hpcsim.2014.6903790