Showing 1 - 9 of 9 for search: '"Yasuko Eckert"'
Published in:
NOCS
The performance of graphics processing unit (GPU) workloads can be sensitive to the various clock domains that are dynamically tunable in modern GPUs. In this work, we observe that GPU application performance is sensitive to NoC clock frequenc
Published in:
PACT
Graphics Processing Units (GPUs) concurrently execute thousands of threads, which makes them effective for achieving high throughput for a wide range of applications. However, the memory wall often limits peak throughput. GPUs use caches to address t
Author:
Yasuko Eckert, Chintan S. Patel, Alan Dodson Smith, Natalie Enright Jerger, Eric Christopher Morton, Jieming Yin, Mark Oskin, Gabriel H. Loh, Subhash Sethumurugan
Published in:
HPCA
There has been a lot of recent interest in applying machine learning (ML) to the design of systems, which purports to aid human experts in extracting new insights leading to better systems. In this work, we share our experiences with applying ML to i
Author:
Gabriel H. Loh, Lifeng Nai, Ramyad Hadidi, Hyesoon Kim, Onur Kayiran, Nuwan Jayasena, Hyojong Kim, Yasuko Eckert
Published in:
ACM Transactions on Architecture and Code Optimization. 15:1-23
To exploit parallelism and scalability of multiple GPUs in a system, it is critical to place compute and data together. However, two key techniques that have been used to hide memory latency and improve thread-level parallelism (TLP), memory interlea
Published in:
ICS
Near-memory processing or processing-in-memory (PIM) has recently regained a lot of interest as a viable solution to overcome the challenges imposed by the memory wall. This trend has been mainly fueled by the emergence of 3D-stacked memories. GPUs are t
Published in:
MEMSYS
Silicon interposer technology is promising for large-scale integration of memory within a processor package. While past work on vertical, 3D-stacked memory allows a stack of memory to be placed directly on top of a processor, the total amount of memo
Published in:
SIGMETRICS
Idle power is a significant contributor to overall energy consumption in modern multi-core processors. Cores can enter a full-sleep state, also known as C6, to reduce idle power; however, entering C6 incurs performance and power overheads. Since powe
Published in:
HPCA
The steadily increasing sizes of main memory capacities require corresponding increases in the processor's translation lookaside buffer (TLB) resources to avoid performance bottlenecks. Large operating system page sizes can mitigate the bottleneck wi
Published in:
ISLPED
The limited utility of voltage scaling in nano-scale technologies has led high-performance processors to rely increasingly on frequency scaling for power management. However, frequency scaling provides only a linear dynamic power reduction. In this pa