On Optimizing Complex Stencils on GPUs

Authors:	Aravind Sukumaran-Rajam, P. Sadayappan, Atanas Rountev, Louis-Noël Pouchet, Prashant Singh Rawat, Miheer Vaidya
Publication year:	2019
Subjects:	010302 applied physics; Profiling (computer programming); Stencil code; Computer science; 020207 software engineering; 02 engineering and technology; Parallel computing; Software_PROGRAMMINGTECHNIQUES; Program optimization; 01 natural sciences; Stencil; Bottleneck; CUDA; Kernel (image processing); 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Code generation; General-purpose computing on graphics processing units; ComputingMethodologies_COMPUTERGRAPHICS
Source:	IPDPS
DOI:	10.1109/ipdps.2019.00073
Description:	Stencil computations are often the compute-intensive kernel in many scientific applications. With the increasing demand for computational accuracy, and the emergence of massively data-parallel high-bandwidth architectures like GPUs, stencils have steadily become more complex in terms of the stencil order, data accesses, and reuse patterns. Many prior efforts have focused on optimizing simpler stencil computations on various platforms. However, existing stencil code generators face challenges in optimizing such complex multi-statement stencil DAGs. This paper addresses the challenges in optimizing high-order stencil DAGs on GPUs by focusing on two key considerations: (1) enabling the domain expert to guide the code optimization, which may otherwise be extremely challenging for complex stencils; and (2) using bottleneck analysis via runtime profiling to guide the application of optimizations, and the tuning of various code generation parameters. We implement these abstractions in a prototype code generation framework termed Artemis, and evaluate its efficacy over multiple stencil kernels with varying complexity and operational intensity on an NVIDIA P100 GPU.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::61883daaa84a71ee9eb5b68a44a3ca48 https://doi.org/10.1109/ipdps.2019.00073