Performance Analytics for Computational Experiments (PACE)

Authors: Sreepathi, Sarat; Mitchell, Zachary; KC, Gaurab
Year of publication: 2019
DOI: 10.6084/m9.figshare.7763552
Description: The Energy Exascale Earth System Model (E3SM) is a high-resolution coupled Earth system model designed to address energy-related science challenges of national interest while making effective use of Department of Energy (DOE) supercomputers. This work presents PACE (Performance Analytics for Computational Experiments), a framework that summarizes performance data collected from E3SM experiments, derives insights from it, and presents them through a web portal. PACE is designed to help identify bottlenecks and targets for performance engineering and optimization. E3SM includes a default lightweight performance profiling capability based on the General Purpose Timing Library (GPTL). PACE ingests the performance data from a completed experiment to support interactive performance exploration, including deep dives into the performance of individual parallel processes and threads. It also enables multi-experiment comparisons, including scalability analysis for well-defined problem configurations. PACE uses a MariaDB database to store structured and unstructured experiment outputs; various tools in the Python ecosystem for the backend infrastructure and middleware; and JavaScript tools for the frontend and visual analytics. PACE enabled climate scientists to view an executive summary of E3SM experiments and to explore them interactively in more depth as desired. PACE is designed to be generic, with reusable components that facilitate performance data collection and analysis for diverse applications.
Database: OpenAIRE