Showing 1 - 10 of 91 for search: '"Evgeny Bolotin"'
Author:
Evgeny Bolotin, Niladrish Chatterjee, Daniel Lustig, Nan Jiang, Yaosheng Fu, Oreste Villa, David Nellans, Zi Yan
Published in:
HPCA
The demands of high-performance computing (HPC) and machine learning (ML) workloads have resulted in the rapid architectural evolution of GPUs over the last decade. The growing memory footprint and diversity of data types in these workloads has requi
Published in:
HPCA
Prior work on GPU cache coherence has shown that simple hardware- or software-based protocols can be more than sufficient. However, in recent years, features such as multi-chip modules have added deeper hierarchy and non-uniformity into GPU memory sys
Published in:
HPCA
Published in:
MICRO
Historically, improvement in GPU performance has been tightly coupled with transistor scaling. As Moore's Law slows down, performance of single GPUs may ultimately plateau. To continue GPU performance scaling, multiple GPUs can be connected using sys
Published in:
IEEE Micro. 35:60-68
Recent packaging technologies that enable DRAM chips to be stacked inside the processor package or on top of the processor chip can lower DRAM energy-per-bit costs, provide wider interfaces, and offer higher bandwidth. However, these technologies are
Author:
Evgeny Bolotin, Oreste Villa, Eiman Ebrahimi, David Nellans, Aamer Jaleel, Akhil Arunkumar, Ugljesa Milic, Carole-Jean Wu, Benjamin Cho
Published in:
ISCA
Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single
Author:
Aamer Jaleel, Oreste Villa, Akhil Arunkumar, Ugljesa Milic, Alex Ramirez, Eiman Ebrahimi, Evgeny Bolotin, David Nellans
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
MICRO
Recercat. Dipósit de la Recerca de Catalunya
GPUs achieve high throughput and power efficiency by employing many small single instruction multiple thread (SIMT) cores. To minimize scheduling logic and performance variance, they utilize a uniform memory system and leverage strong data parallelism
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9eb7db3e25c87a542b4146cf5f044686
Author:
Stephen W. Keckler, Evgeny Bolotin, Joel Emer, Mike O'Connor, Niladrish Chatterjee, Aditya Agrawal
Published in:
MEMSYS
With increasing DRAM densities, the performance and energy overheads of refresh operations are increasingly significant. When the system is active, refresh commands render DRAM banks unavailable for increasing periods of time. These refresh operation
Author:
Evgeny Bolotin, Gennady Pekhimenko, Todd C. Mowry, Onur Mutlu, Stephen W. Keckler, Nandita Vijaykumar
Published in:
HPCA
Data compression can be an effective method to achieve higher system performance and energy efficiency in modern data-intensive applications by exploiting redundancy and data similarity. Prior works have studied a variety of data compression techniqu
Published in:
ACM Transactions on Architecture and Code Optimization
GPGPUs are optimized for graphics; for that reason, the hardware is optimized for massively data-parallel applications characterized by predictable memory access patterns and little control flow. For such applications, e.g., matrix multiplication, GPG