Towards Runtime Analytics in a Parallel Performance System

Authors: Kevin Huck, Sameer Shende, Allen D. Malony, Chad Wood, Srinivasan Ramesh
Year of publication: 2019
Source: HPCS
Description: Developers of scientific simulations use parallel performance systems to measure, analyze, and tune their applications on large-scale HPC machines. In the majority of these performance systems, the analysis takes place offline. More importantly, if runtime analytics are desired, the performance measurement infrastructure must be designed and implemented in a way that makes them possible. We investigate how to create runtime analytics capabilities by considering this objective in a reference platform: the TAU Performance System. Our research identifies general issues of concern and describes how they can be addressed in a new TAU-based analytics framework. Several case studies are proposed as different analytics examples. These are prototyped, evaluated on HPC machines, and discussed. The outcomes of the study suggest that runtime analytics has merit. Furthermore, we believe the approach could carry forward directly to other parallel performance systems.
Database: OpenAIRE