Showing 1 - 10 of 63 for search: '"Naigang Wang"'
Author:
Jinwook Oh, Alyssa Herbert, Marcel Schaal, Zhibin Ren, Ching Zhou, Siyu Koswatta, Naigang Wang, Matthew Cohen, Vidhi Zalani, Howard M. Haynie, Matthew M. Ziegler, Sae Kyu Lee, Brian W. Curran, Monodeep Kar, Martin Lutz, Xin Zhang, Robert Casatuta, Vijayalakshmi Srinivasan, Nianzheng Cao, Sunil Shukla, Pong-Fei Lu, Leland Chang, Michael A. Guillorn, Bruce M. Fleischer, Michael R. Scheuermann, Joel Abraham Silberman, Kerstin Schelm, Vinay Velji Shah, Chia-Yu Chen, Kailash Gopalakrishnan, Swagath Venkataramani, Hung Tran, Mingu Kang, Wei Wang, Jungwook Choi, Scot H. Rider, Jinwook Jung, James J. Bonanno, Radhika Jain, Li Yulong, Xiao Sun, Silvia Melitta Mueller, Kyu-hyoun Kim, Ankur Agrawal
Published in:
IEEE Journal of Solid-State Circuits. 57:182-197
Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precision
Author:
Michael J. Klaiber, George D. Gristede, Shih-Hsien Lo, Hiroshi Inoue, Leland Chang, Christos Vezyrtzis, Jungwook Choi, Gary W. Maier, Fanchieh Yee, Shubham Jain, Brian W. Curran, Jintao Zhang, Mingu Kang, Howard M. Haynie, Mauricio J. Serrano, Pong-Fei Lu, Silvia Melitta Mueller, Matthew M. Ziegler, Bruce M. Fleischer, Kazuaki Ishizaki, Kailash Gopalakrishnan, Michael R. Scheuermann, Ankur Agarwal, Xiao Sun, Sunil Shukla, Thomas W. Fox, Vijayalakshmi Srinivasan, Tina Babinsky, Swagath Venkataramani, Michael A. Guillorn, Ching Zhou, Nianzheng Cao, Eri Ogawa, Naigang Wang, Moriyoshi Ohara, Joel Abraham Silberman, Jinwook Oh, Marcel Schaal, Chia-Yu Chen, Wei Wang
Published in:
Proceedings of the IEEE. 108:2232-2250
Advances in deep neural networks (DNNs) and the availability of massive real-world data have enabled superhuman levels of accuracy on many AI tasks and ushered in the explosive growth of AI workloads across the spectrum of computing devices. However, th
Published in:
Archives of Pharmacal Research. 42:1063-1070
Hesperetin, a major bioflavonoid in sweet oranges and lemons, exerts an anti-inflammatory effect in pulmonary diseases; however, its effect on lipopolysaccharide (LPS)-induced acute lung injury is unclear. This study investigated the effect of hesper
Author:
Kailash Gopalakrishnan, Swagath Venkataramani, Wei Zhang, George Saon, Xiao Sun, Andrea Fasoli, Chia-Yu Chen, Xiaodong Cui, Mauricio J. Serrano, Zoltán Tüske, Naigang Wang, Brian Kingsbury
We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models (DBLSTM-H
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::692d7369d4cfd471cc31430dde91dc57
http://arxiv.org/abs/2108.12074
Author:
Hamza Ouarnoughi, Hadjer Benmeziane, Kaoutar El Maghraoui, Martin Wistuba, Naigang Wang, Smail Niar
Published in:
Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}
Thirtieth International Joint Conference on Artificial Intelligence, Aug 2021, Montreal, Canada. pp.4322-4329, ⟨10.24963/ijcai.2021/592⟩
IJCAI
There is no doubt that making AI mainstream by bringing powerful yet power-hungry deep neural networks (DNNs) to resource-constrained devices would require an efficient co-design of algorithms, hardware and software. The inc
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f181376173073a4b8c53f902c965bb73
https://hal-uphf.archives-ouvertes.fr/hal-03379694
Author:
Scot H. Rider, Martin Lutz, Moriyoshi Ohara, Pong-Fei Lu, Monodeep Kar, Xiao Sun, Kailash Gopalakrishnan, Jie Yang, Hoang Tran, Wei Wang, Michael A. Guillorn, Marcel Schaal, Ankur Agrawal, Xin Zhang, Joel Abraham Silberman, Sunil Shukla, Nianzheng Cao, James Bonano, Zhibin Ren, Sanchari Sen, Siyu Koswatta, Kyu-hyoun Kim, Mingu Kang, Swagath Venkataramani, Eri Ogawa, Vijayalakshmi Srinivasan, Hiroshi Inoue, Matt Ziegler, Howard M. Haynie, Shubham Jain, Vinay Velji Shah, Allison Allain, Jintao Zhang, Matthew Cohen, Jungwook Choi, Kerstin Schelm, Jinwook Oh, Li Yulong, Chia-Yu Chen, Ching Zhou, Naigang Wang, Jinwook Jung, Sae Kyu Lee, Silvia Melitta Mueller, Kazuaki Ishizaki, Bruce M. Fleischer, Michael R. Scheuermann, Vidhi Zalani, Brian W. Curran, Leland Chang, Mauricio J. Serrano, Ashish Ranjan, Alberto Mannari, Robert Casatuta
Published in:
ISCA
The growing prevalence and computational demands of Artificial Intelligence (AI) workloads have led to widespread use of hardware accelerators in their execution. Scaling the performance of AI accelerators across generations is pivotal to their succes
Author:
Matthew M. Ziegler, Sunil Shukla, Gary W. Maier, Jinwook Oh, Kailash Gopalakrishnan, Christos Vezyrtzis, Thomas W. Fox, Michael J. Klaiber, Howard M. Haynie, Swagath Venkataramani, Leland Chang, Jungwook Choi, Nianzheng Cao, Pong-Fei Lu, Pierce Chuang, Michael A. Guillorn, Brian W. Curran, Dongsoo Lee, Fanchieh Yee, Ankur Agrawal, Ching Zhou, Silvia Melitta Mueller, Naigang Wang, George D. Gristede, Bruce M. Fleischer, Michael R. Scheuermann, Tina Babinsky, Vijayalakshmi Srinivasan, Chia-Yu Chen, Joel Abraham Silberman, Shih-Hsien Lo
Published in:
IEEE Solid-State Circuits Letters. 1:217-220
This letter presents a multi-TOPS AI accelerator core for deep learning training and inference. With a programmable architecture and custom ISA, this engine achieves >90% sustained utilization across the range of neural network topologies by employin
Author:
Xin Zhang, Vijayalakshmi Srinivasan, Wei Wang, Jungwook Choi, Siyu Koswatta, Mingu Kang, Li Yulong, Bruce M. Fleischer, Radhika Jain, Michael R. Scheuermann, Kerstin Schelm, Kailash Gopalakrishnan, Monodeep Kar, Zhibin Ren, Michael A. Guillorn, Swagath Venkataramani, Howard M. Haynie, Xiao Sun, Matthew M. Ziegler, Hung Tran, Sae Kyu Lee, Kyu-hyoun Kim, Joel Abraham Silberman, Martin Lutz, Silvia Melitta Mueller, Sunil Shukla, Pong-Fei Lu, Vidhi Zalani, Ching Zhou, Brian W. Curran, Vinay Velji Shah, Naigang Wang, Leland Chang, Robert Casatuta, Alyssa Herbert, Nianzheng Cao, Scot H. Rider, Marcel Schaal, Ankur Agrawal, Jinwook Oh, Jinwook Jung, James J. Bonanno, Matthew Cohen, Chia-Yu Chen
Published in:
ISSCC
Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm²) in AI hardware accelerators across cloud and edge platforms. However, robust deep learning (DL) model accuracy equivalent to high-precision c
Author:
Gary W. Maier, Wei Wang, Siyu Koswatta, Vijayalakshmi Srinivasan, Howard M. Haynie, George D. Gristede, Bruce M. Fleischer, Michael R. Scheuermann, Matthew M. Ziegler, Sunil Shukla, Jinwook Oh, Vicktoria Ivanov, Kailash Gopalakrishnan, Martin Lutz, Ching Zhou, Xiao Sun, Silvia Melitta Mueller, Brian W. Curran, Pong-Fei Lu, Thomas W. Fox, Swagath Venkataramani, Nianzheng Cao, Ankur Agrawal, Robert Casatuta, Naigang Wang, Jungwook Choi, Vinay Velji Shah, Alex Mesh, Marcel Schaal, Scot H. Rider, Fanchieh Yee, Joel Abraham Silberman, James J. Bonanno, Michael A. Guillorn, Mingu Kang, Sae Kyu Lee, Shimon Ben-Yehuda, Erez Ophir, Chia-Yu Chen, Matthew Cohen, Yevgeny Nustov, Leland Chang, Shih-Hsien Lo
Published in:
VLSI Circuits
A processor core is presented for AI training and inference products. Leading-edge compute efficiency is achieved for robust fp16 training via efficient heterogeneous 2-D systolic array-SIMD compute engines leveraging compact DLFloat16 FPUs. Architec
Author:
Silvia Melitta Mueller, Naigang Wang, Kailash Gopalakrishnan, Ankur Agrawal, Bruce M. Fleischer, Xiao Sun, Jungwook Choi
Published in:
ARITH
The resilience of Deep Learning (DL) training and inference workloads to low-precision computations, coupled with the demand for power- and area-efficient hardware accelerators for these workloads, has led to the emergence of 16-bit floating point for