Showing 1 - 10 of 39 for search: '"Steve Dai"'
Author:
Ben Keller, Rangharajan Venkatesan, Steve Dai, Stephen G. Tell, Brian Zimmer, Charbel Sakr, William J. Dally, C. Thomas Gray, Brucek Khailany
Published in:
IEEE Journal of Solid-State Circuits. 58:1129-1141
Author:
Ben Keller, Rangharajan Venkatesan, Steve Dai, Stephen G. Tell, Brian Zimmer, William J. Dally, C. Thomas Gray, Brucek Khailany
Published in:
2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits).
Author:
Steve Dai, Ben Keller, William J. Dally, Brucek Khailany, Rangharajan Venkatesan, Alicia Klinefelter, Robert M. Kirby, Saad Godil, Yanqing Zhang, Haoxing Ren, Bryan Catanzaro
Published in:
IEEE Micro. 40:23-32
Recent advancements in machine learning provide an opportunity to transform chip design workflows. We review recent research applying techniques such as deep convolutional neural networks and graph-based neural networks in the areas of automatic desi
Author:
Ian Galton, Ritchie Zhao, Tutu Ajayi, Shaolin Xie, Christopher Batten, Paul Gao, Austin Rovinski, Chun Zhao, Steve Dai, Scott Davidson, Dustin Richmond, Aporva Amarnath, Zhiru Zhang, Khalid Al-Hawaj, Ronald G. Dreslinski, Luis Vega, Bandhav Veluri, Anuj Rao, Julian Puscar, Michael Taylor, Christopher Torng
Published in:
IEEE Solid-State Circuits Letters. 2:289-292
This letter presents a 16-nm 496-core RISC-V network-on-chip (NoC). The mesh achieves 1.4 GHz at 0.98 V, yielding a peak throughput of 695 Giga RISC-V instructions/s (GRVIS), a peak energy efficiency of 314.89 GRVIS/W, and a record 825 320 CoreMark b
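The peak-throughput figure quoted in the abstract follows from simple arithmetic; a minimal sketch, assuming each of the 496 cores can retire one instruction per cycle at the stated 1.4 GHz (an assumption of this sketch, not stated in the listing):

```python
# Back-of-the-envelope check of the Celerity NoC peak-throughput figure.
cores = 496            # RISC-V cores on the mesh
freq_ghz = 1.4         # clock at 0.98 V
peak_grvis = cores * freq_ghz          # Giga RISC-V instructions per second
# ~695 GRVIS, consistent with the abstract (1.4 GHz is itself rounded)

# Implied power draw at the quoted peak energy efficiency of 314.89 GRVIS/W.
implied_power_w = peak_grvis / 314.89
```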
Published in:
Journal of the American Ceramic Society. 102:5180-5191
Published in:
DAC
Transformers have transformed the field of natural language processing. This performance is largely attributed to the use of stacked self-attention layers, each of which consists of matrix multiplies as well as softmax operations. As a result, unlike
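The abstract points out that each self-attention layer pairs matrix multiplies with softmax operations. A minimal plain-Python sketch of the softmax step (toy scores, not from the paper) shows why it behaves differently from a matmul in hardware: it requires exponentials and a row-wise normalization across all scores:

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the row max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Attention weights for one query over four keys (toy score values).
weights = softmax([2.0, 1.0, 0.5, -1.0])
```

The reduction over the whole row (max, then sum) is what makes softmax awkward to pipeline alongside the surrounding matrix multiplies.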
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a43f355bec7dd1e3998c35892316aab8
http://arxiv.org/abs/2103.09301
Published in:
FPGA
FCCM
Image processing applications can benefit tremendously from FPGA acceleration. However, hardware accelerators for these applications look very different from the programs that image processing algorithm designers are accustomed to writing. As a resul
Author:
Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, Mustafa Ali, Ming-Yu Liu, Brucek Khailany, William J. Dally, Anima Anandkumar
Representing deep neural networks (DNNs) in low-precision is a promising approach to enable efficient acceleration and memory reduction. Previous methods that train DNNs in low-precision typically keep a copy of weights in high-precision during the w
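A toy illustration of the prior-work scheme the abstract describes, a high-precision master copy of the weights maintained alongside low-precision weights during training. The uniform quantizer and all values here are assumptions of this sketch, not details from the paper:

```python
def quantize(weights, step=0.25):
    # Toy uniform quantizer: round each weight to the nearest multiple of `step`.
    return [round(w / step) * step for w in weights]

master = [0.31, -0.12, 0.07]   # full-precision copy kept throughout training
lr = 0.1
grads = [0.05, -0.02, 0.01]    # toy gradients from one training step

q = quantize(master)           # low-precision weights used in the forward pass
# Gradient updates accumulate into the master copy, not the quantized weights,
# so small updates are not lost to rounding.
master = [w - lr * g for w, g in zip(master, grads)]
```

Keeping the master copy is exactly the memory overhead that motivates methods which train without one.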
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::96561221c389ff836eb77e2054e9f381
Author:
Miaorong Wang, Nathaniel Pinckney, Brucek Khailany, Alicia Klinefelter, Rangharajan Venkatesan, Ben Keller, Jason Clemons, William J. Dally, Matthew Fojtik, Stephen W. Keckler, Brian Zimmer, Yakun Sophia Shao, Joel Emer, Priyanka Raina, Yanqing Zhang, Steve Dai
Published in:
ICCAD
Deep neural networks have been adopted in a wide range of application domains, leading to high demand for inference accelerators. However, the high cost associated with ASIC hardware design makes it challenging to build custom accelerators for differ
Author:
Michael Taylor, Anuj Rao, Christopher Torng, Zhiru Zhang, Khalid Al-Hawai, Ritchie Zhao, Austin Rovinski, Shaolin Xie, Steve Dai, Aporva Amarnath, Ronald G. Dreslinski, Rajesh Gupta, Tutu Ajayi, Luis Vega, Scott Davidson, Bandhav Veluri, Gai Liu, Chun Zhao, Christopher Batten, Paul Gao
Published in:
IEEE Micro. 38:30-41
Rapidly emerging workloads require rapidly developed chips. The Celerity 16-nm open-source SoC was implemented in nine months using an architectural trifecta to minimize development time: a general-purpose tier comprised of open-source Linux-capable