Showing 1 - 10 of 65 for search: '"Steven K. Reinhardt"'
Author:
Prerak Patel, Gabriel Weisz, Kalin Ovtcharov, Lo Daniel, Doug Burger, Shlomi Alkalay, Stephen F. Heil, Adam Sapek, Michael Haselman, Steven K. Reinhardt, Todd Massengill, Jeremy Fowers, Eric S. Chung, Michael K. Papamichael, Adrian M. Caulfield, Logan Adams, Sitaram Lanka, Ming Liu, Lisa Woods, Mahdi Ghandi
Published in:
IEEE Micro. 39:20-28
Growing computational demands from deep neural networks (DNNs), coupled with diminishing returns from general-purpose architectures, have led to a proliferation of Neural Processing Units (NPUs). This paper describes the Project Brainwave NPU (BW-NPU
Author:
Friedel van Megen, Oren Firestein, Bita Darvish Rouhani, Mahdi Ghandi, Christian Boehn, Prerak Patel, Kara Kagi, Hari Angepat, Doug Burger, Brandon Perez, Raja Seera, Tamas Juhasz, Jeremy Fowers, Shlomi Alkalay, Logan Adams, Gabriel Weisz, Balaji Sridharan, Sangeetha Shekar, Kyle Holohan, Ritchie Zhao, Amanda Rapsang, Ahmad M. El Husseini, Adam Sapek, Todd Massengill, Kalin Ovtcharov, Sitaram Lanka, Dan Zhang, Michael K. Papamichael, Derek Chiou, Lo Daniel, Michael Haselman, Lisa Woods, Kang Su Gatlin, Maleen Abeydeera, Phillip Yi Xiao, Steven K. Reinhardt, Adrian M. Caulfield, Eric S. Chung, Alessandro Forin, Stephen F. Heil, Ratna Kumar Kovvuri, Dima Mukhortov, Ming Liu
Published in:
IEEE Micro. 38:8-20
To meet the computational demands required of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance. Project Brainwave, Microsoft's principal infrastructure for AI serving in real time, accelera
Published in:
IEEE Micro. 37:6-12
Hardware vendors face serious challenges as technology limitations have slowed the rate of evolutionary improvement in general-purpose processors. Staying on the current trend line requires significant expenditure, while transitioning to a disruptive
Author:
Steven K. Reinhardt, Brad Benton, Michael LeBeane, Mauricio Breternitz, Khaled Hamidouche, Lizy K. John
Published in:
PACT
Current state-of-the-art in GPU networking advocates a host-centric model that reduces performance and increases code complexity. Recently, researchers have explored several techniques for networking within a GPU kernel itself. These approaches, howe
Published in:
International Journal of Parallel Programming. 45:657-679
Graph applications are common in scientific and enterprise computing. Recent research used graphics processing units (GPUs) to accelerate graph workloads. These applications tend to present characteristics that are challenging for SIMD execution. To
Author:
Jan Vesely, Gabriel H. Loh, Mark Oskin, Arkaprava Basu, Steven K. Reinhardt, Abhishek Bhattacharjee
Published in:
ISCA
GPUs are becoming first-class compute citizens and increasingly support programmability-enhancing features such as shared virtual memory and hardware cache coherence. This enables them to run a wider variety of programs. However, a key aspect of gene
Published in:
SC
Distributed systems incorporate GPUs because they provide massive parallelism in an energy-efficient manner. Unfortunately, existing programming models make it difficult to route a GPU-initiated network message. The traditional coprocessor model forc
Author:
Khaled Hamidouche, Michael LeBeane, Steven K. Reinhardt, Brad Benton, Lizy K. John, Mauricio Breternitz
Published in:
SC
GPUs are widespread across clusters of compute nodes due to their attractive performance for data parallel codes. However, communicating between GPUs across the cluster is cumbersome when compared to CPU networking implementations. A number of recent
Author:
Bradford M. Beckmann, Nuwan Jayasena, Indrani Paul, Steven K. Reinhardt, Gregory Rodgers, Mike Ignatowski, William C. Brantley, Sudhanva Gurumurthi, Gabriel H. Loh, Michael J. Schulte
Published in:
IEEE Micro. 35:26-36
This article provides an overview of AMD's vision for exascale computing, and in particular, how heterogeneity will play a central role in realizing this vision. Exascale computing requires high levels of performance capabilities while staying within
Author:
Steven K. Reinhardt, Mark D. Hill, Blake A. Hechtman, Bradford M. Beckmann, Derek R. Hower, Darien Wood, Benedict R. Gaster
Published in:
ASPLOS
Commodity heterogeneous systems (e.g., integrated CPUs and GPUs) now support a unified, shared memory address space for all components. Because the latency of global communication in a heterogeneous system can be prohibitively high, heterogeneous s