Latency tolerance for throughput computing

Authors:	Chien-Ping Lu, Brian Ko
Year of publication:	2012
Subject:	Computer science; Register file; Operating system; Workload; Thread (computing); Parallel computing; Latency (engineering)
Source:	Proceedings of the International Conference on Computer-Aided Design.
DOI:	10.1145/2429384.2429496
Description:	In Throughput Computing, data can be processed independently by a large number of threads running similar programs, referred to as kernels, or as shaders for graphics-specific workloads. A Throughput Computing device, such as a GPU, requires task latency tolerance to hold the context of the outstanding threads, and data latency tolerance to hold space for the memory requests issued by those threads. The threads are organized into thread groups. The register file, and the number of outstanding thread groups it supports, should be sized according to the ratio of computing resources to load/store units. That ratio should in turn reflect the balance between ALU and load/store instructions in the target workload.
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_________::0cc0b6fe13f8bd31654cf6aeaf1d27a8 https://doi.org/10.1145/2429384.2429496
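
The sizing argument in the description can be illustrated with a rough back-of-the-envelope sketch. The model below is an assumption and is not taken from the paper: it supposes that while one thread group waits on a memory request, the other resident groups must supply enough ALU work to cover the memory latency. All function names, parameters, and numbers are hypothetical.

```python
import math


def thread_groups_to_hide_latency(mem_latency_cycles, alu_instr_per_load,
                                  cycles_per_alu_instr=1):
    """Estimate resident thread groups needed to cover one memory latency.

    Assumed model (not from the paper): each group runs
    `alu_instr_per_load` ALU instructions between loads, and the other
    groups must fill `mem_latency_cycles` of waiting with ALU work.
    """
    work_between_loads = alu_instr_per_load * cycles_per_alu_instr
    # One group is stalled on memory; the rest keep the ALUs busy.
    return 1 + math.ceil(mem_latency_cycles / work_between_loads)


def register_file_bytes(groups, threads_per_group, regs_per_thread,
                        bytes_per_reg=4):
    """Register storage needed to hold the context of all resident groups."""
    return groups * threads_per_group * regs_per_thread * bytes_per_reg


# Hypothetical workload: 400-cycle memory latency, 10 ALU instructions per load.
groups = thread_groups_to_hide_latency(mem_latency_cycles=400,
                                       alu_instr_per_load=10)
size = register_file_bytes(groups, threads_per_group=32, regs_per_thread=16)
print(groups, size)
```

Under this assumed model, a workload with more ALU instructions per load needs fewer resident thread groups, and therefore a smaller register file, which matches the direction of the relationship the description states.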