HAWS

Authors:	Xun Gong, Xiang Gong, David Kaeli, Leiming Yu
Year of publication:	2019
Subject:	Hardware_MEMORYSTRUCTURES; Out-of-order execution; Memory hierarchy; business.industry; Computer science; Preemption; Thread (computing); computer.software_genre; Supercomputer; CAS latency; Hardware and Architecture; Embedded system; Compiler; Graphics; business; computer; Software; Information Systems
Source:	ACM Transactions on Architecture and Code Optimization. 16:1-22
ISSN:	1544-3973 1544-3566
DOI:	10.1145/3291050
Description:	Graphics Processing Units (GPUs) have become an attractive platform for accelerating challenging applications on a range of platforms, from High Performance Computing (HPC) to full-featured smartphones. They can overcome computational barriers in a wide range of data-parallel kernels. GPUs hide pipeline stalls and memory latency by utilizing efficient thread preemption. But given the demands on the memory hierarchy due to the growth in the number of computing cores on-chip, it has become increasingly difficult to hide all of these stalls. In this article, we propose a novel Hint-Assisted Wavefront Scheduler (HAWS) to bypass long-latency stalls. HAWS starts by enhancing a compiler infrastructure to identify potential opportunities that can bypass memory stalls. HAWS includes a wavefront scheduler that can continue to execute instructions in the shadow of a memory stall, executing instructions speculatively, guided by compiler-generated hints. HAWS increases utilization of GPU resources by aggressively fetching/executing speculatively. Based on our simulation results on the AMD Southern Islands GPU architecture, at an estimated cost of 0.4% total chip area, HAWS can improve application performance by 14.6% on average for memory-intensive applications.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::8ec6c2df40ff1eb75aff1d4a6a8b9ff0 https://doi.org/10.1145/3291050