Performance Analysis of GPU Programming Models Using the Roofline Scaling Trajectories

Authors: Ibrahim, KZ; Williams, S; Oliker, L
Contributors: Gao, Wanling; Zhan, Jianfeng; Fox, Geoffrey C; Lu, Xiaoyi; Stanzione, Dan
Publication year: 2020
Subject:
Description: Performance analysis is a daunting job, especially for rapidly evolving accelerator technologies. The Roofline Scaling Trajectories technique aims to diagnose various performance bottlenecks for GPU programming models through visually intuitive Roofline plots. In this work, we introduce the use of Roofline Scaling Trajectories to capture major performance bottlenecks on NVIDIA Volta GPU architectures, such as warp efficiency, occupancy, and locality. Using this analysis technique, we explain the performance characteristics of the NAS Parallel Benchmarks (NPB) written with two programming models, CUDA and OpenACC. We present the influence of the programming model on the performance and scaling characteristics. We also leverage the insights of the Roofline Scaling Trajectory analysis to tune some of the NAS Parallel Benchmarks, achieving up to 2× speedup.
Database: OpenAIRE