Showing 1 - 10 of 32 results for search: '"Michael R. Scheuermann"'
Author:
Jinwook Oh, Alyssa Herbert, Marcel Schaal, Zhibin Ren, Ching Zhou, Siyu Koswatta, Naigang Wang, Matthew Cohen, Vidhi Zalani, Howard M. Haynie, Matthew M. Ziegler, Sae Kyu Lee, Brian W. Curran, Monodeep Kar, Martin Lutz, Xin Zhang, Robert Casatuta, Vijayalakshmi Srinivasan, Nianzheng Cao, Sunil Shukla, Pong-Fei Lu, Leland Chang, Michael A. Guillorn, Bruce M. Fleischer, Michael R. Scheuermann, Joel Abraham Silberman, Kerstin Schelm, Vinay Velji Shah, Chia-Yu Chen, Kailash Gopalakrishnan, Swagath Venkataramani, Hung Tran, Mingu Kang, Wei Wang, Jungwook Choi, Scot H. Rider, Jinwook Jung, James J. Bonanno, Radhika Jain, Li Yulong, Xiao Sun, Silvia Melitta Mueller, Kyu-hyoun Kim, Ankur Agrawal
Published in:
IEEE Journal of Solid-State Circuits. 57:182-197
Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precision…
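To make the idea of reduced-precision computation concrete, a minimal NumPy sketch follows; it emulates a matrix multiply at fp16 and int8 and compares the results against an fp32 reference. The precision choices and quantization scheme are illustrative assumptions, not the chip's actual datapath.

```python
# Illustrative sketch only: emulates reduced-precision matrix multiplication in
# software to show the accuracy/efficiency trade-off behind low-precision compute.
# The fp16/int8 choices and per-matrix scaling are assumptions for demonstration.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 64)).astype(np.float32)
B = rng.standard_normal((64, 64)).astype(np.float32)

ref = A @ B                                   # fp32 reference result

fp16 = (A.astype(np.float16) @ B.astype(np.float16)).astype(np.float32)

# Simple symmetric int8 quantization with per-matrix scales.
sa, sb = np.abs(A).max() / 127.0, np.abs(B).max() / 127.0
qa = np.round(A / sa).astype(np.int8)
qb = np.round(B / sb).astype(np.int8)
int8 = (qa.astype(np.int32) @ qb.astype(np.int32)).astype(np.float32) * (sa * sb)

for name, out in [("fp16", fp16), ("int8", int8)]:
    err = np.abs(out - ref).max() / np.abs(ref).max()
    print(f"{name}: relative error {err:.2e}")
```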
Author:
Michael J. Klaiber, George D. Gristede, Shih-Hsien Lo, Hiroshi Inoue, Leland Chang, Christos Vezyrtzis, Jungwook Choi, Gary W. Maier, Fanchieh Yee, Shubham Jain, Brian W. Curran, Jintao Zhang, Mingu Kang, Howard M. Haynie, Mauricio J. Serrano, Pong-Fei Lu, Silvia Melitta Mueller, Matthew M. Ziegler, Bruce M. Fleischer, Kazuaki Ishizaki, Kailash Gopalakrishnan, Michael R. Scheuermann, Ankur Agarwal, Xiao Sun, Sunil Shukla, Thomas W. Fox, Vijayalakshmi Srinivasan, Tina Babinsky, Swagath Venkataramani, Michael A. Guillorn, Ching Zhou, Nianzheng Cao, Eri Ogawa, Naigang Wang, Moriyoshi Ohara, Joel Abraham Silberman, Jinwook Oh, Marcel Schaal, Chia-Yu Chen, Wei Wang
Published in:
Proceedings of the IEEE. 108:2232-2250
Advances in deep neural networks (DNNs) and the availability of massive real-world data have enabled superhuman levels of accuracy on many AI tasks and ushered in the explosive growth of AI workloads across the spectrum of computing devices. However, th…
Author:
Scot H. Rider, Martin Lutz, Moriyoshi Ohara, Pong-Fei Lu, Monodeep Kar, Xiao Sun, Kailash Gopalakrishnan, Jie Yang, Hoang Tran, Wei Wang, Michael A. Guillorn, Marcel Schaal, Ankur Agrawal, Xin Zhang, Joel Abraham Silberman, Sunil Shukla, Nianzheng Cao, James Bonano, Zhibin Ren, Sanchari Sen, Siyu Koswatta, Kyu-hyoun Kim, Mingu Kang, Swagath Venkataramani, Eri Ogawa, Vijayalakshmi Srinivasan, Hiroshi Inoue, Matt Ziegler, Howard M. Haynie, Shubham Jain, Vinay Velji Shah, Allison Allain, Jintao Zhang, Matthew Cohen, Jungwook Choi, Kerstin Schelm, Jinwook Oh, Li Yulong, Chia-Yu Chen, Ching Zhou, Naigang Wang, Jinwook Jung, Sae Kyu Lee, Silvia Melitta Mueller, Kazuaki Ishizaki, Bruce M. Fleischer, Michael R. Scheuermann, Vidhi Zalani, Brian W. Curran, Leland Chang, Mauricio J. Serrano, Ashish Ranjan, Alberto Mannari, Robert Casatuta
Published in:
ISCA
The growing prevalence and computational demands of Artificial Intelligence (AI) workloads have led to widespread use of hardware accelerators in their execution. Scaling the performance of AI accelerators across generations is pivotal to their succes…
Author:
Matthew M. Ziegler, Sunil Shukla, Gary W. Maier, Jinwook Oh, Kailash Gopalakrishnan, Christos Vezyrtzis, Thomas W. Fox, Michael J. Klaiber, Howard M. Haynie, Swagath Venkataramani, Leland Chang, Jungwook Choi, Nianzheng Cao, Pong-Fei Lu, Pierce Chuang, Michael A. Guillorn, Brian W. Curran, Dongsoo Lee, Fanchieh Yee, Ankur Agrawal, Ching Zhou, Silvia Melitta Mueller, Naigang Wang, George D. Gristede, Bruce M. Fleischer, Michael R. Scheuermann, Tina Babinsky, Vijayalakshmi Srinivasan, Chia-Yu Chen, Joel Abraham Silberman, Shih-Hsien Lo
Published in:
IEEE Solid-State Circuits Letters. 1:217-220
This letter presents a multi-TOPS AI accelerator core for deep learning training and inference. With a programmable architecture and custom ISA, this engine achieves >90% sustained utilization across the range of neural network topologies by employin…
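A back-of-the-envelope sketch of what "sustained utilization" means follows: the ratio of delivered throughput to the core's peak throughput. All figures below are assumed placeholders, not measurements from the letter.

```python
# Hypothetical utilization arithmetic; every number here is an assumed placeholder.
peak_tops = 8.0            # assumed peak throughput of the core, in TOPS
ops_per_inference = 3.9e9  # assumed operations per forward pass of some DNN
inferences_per_s = 1850.0  # assumed measured inference rate

sustained_tops = ops_per_inference * inferences_per_s / 1e12
utilization = sustained_tops / peak_tops
print(f"sustained: {sustained_tops:.2f} TOPS, utilization: {utilization:.1%}")
```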
Author:
Xin Zhang, Vijayalakshmi Srinivasan, Wei Wang, Jungwook Choi, Siyu Koswatta, Mingu Kang, Li Yulong, Bruce M. Fleischer, Radhika Jain, Michael R. Scheuermann, Kerstin Schelm, Kailash Gopalakrishnan, Monodeep Kar, Zhibin Ren, Michael A. Guillorn, Swagath Venkataramani, Howard M. Haynie, Xiao Sun, Matthew M. Ziegler, Hung Tran, Sae Kyu Lee, Kyu-hyoun Kim, Joel Abraham Silberman, Martin Lutz, Silvia Melitta Mueller, Sunil Shukla, Pong-Fei Lu, Vidhi Zalani, Ching Zhou, Brian W. Curran, Vinay Velji Shah, Naigang Wang, Leland Chang, Robert Casatuta, Alyssa Herbert, Nianzheng Cao, Scot H. Rider, Marcel Schaal, Ankur Agrawal, Jinwook Oh, Jinwook Jung, James J. Bonanno, Matthew Cohen, Chia-Yu Chen
Published in:
ISSCC
Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm2) in AI hardware accelerators across cloud and edge platforms. However, robust deep learning (DL) model accuracy equivalent to high-precision c…
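The compute-density metrics named above (TOPS/W and TOPS/mm2) are simple ratios of peak throughput to power and area; a short sketch with assumed placeholder numbers illustrates how they are derived.

```python
# Illustrative compute-density arithmetic. All numbers are assumed placeholders,
# not values reported in the paper.
peak_tops = 16.0   # assumed peak throughput, in TOPS
power_w = 4.0      # assumed power draw, in watts
area_mm2 = 9.8     # assumed core area, in mm^2

print(f"{peak_tops / power_w:.1f} TOPS/W")
print(f"{peak_tops / area_mm2:.2f} TOPS/mm2")
```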
Author:
Gary W. Maier, Wei Wang, Siyu Koswatta, Vijayalakshmi Srinivasan, Howard M. Haynie, George D. Gristede, Bruce M. Fleischer, Michael R. Scheuermann, Matthew M. Ziegler, Sunil Shukla, Jinwook Oh, Vicktoria Ivanov, Kailash Gopalakrishnan, Martin Lutz, Ching Zhou, Xiao Sun, Silvia Melitta Mueller, Brian W. Curran, Pong-Fei Lu, Thomas W. Fox, Swagath Venkataramani, Nianzheng Cao, Ankur Agrawal, Robert Casatuta, Naigang Wang, Jungwook Choi, Vinay Velji Shah, Alex Mesh, Marcel Schaal, Scot H. Rider, Fanchieh Yee, Joel Abraham Silberman, James J. Bonanno, Michael A. Guillorn, Mingu Kang, Sae Kyu Lee, Shimon Ben-Yehuda, Erez Ophir, Chia-Yu Chen, Matthew Cohen, Yevgeny Nustov, Leland Chang, Shih-Hsien Lo
Published in:
VLSI Circuits
A processor core is presented for AI training and inference products. Leading-edge compute efficiency is achieved for robust fp16 training via efficient heterogeneous 2-D systolic array-SIMD compute engines leveraging compact DLFloat16 FPUs. Architec…
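DLFloat16 is commonly described as a 16-bit format with 1 sign, 6 exponent, and 9 fraction bits. The decoder below is a simplified sketch under that assumption, with an exponent bias of 31 and no special-value or subnormal handling, so it should be read as an illustration of the layout rather than a reference implementation.

```python
# Simplified decoder for a 16-bit float laid out as 1 sign / 6 exponent / 9 fraction
# bits (the layout usually described for DLFloat16). Assumes a bias of 31 and
# ignores special values and subnormals; an illustration, not an authoritative spec.
def decode_dlfloat16(bits: int) -> float:
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 9) & 0x3F      # 6 exponent bits
    fraction = bits & 0x1FF            # 9 fraction bits
    value = (1 + fraction / 512.0) * 2.0 ** (exponent - 31)
    return -value if sign else value

# Example encodings: exponent field 31 (unbiased 0) and fraction 0 -> 1.0
print(decode_dlfloat16((31 << 9) | 0))                 # 1.0
print(decode_dlfloat16((1 << 15) | (32 << 9) | 256))   # -3.0
```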
Author:
Jungwook Choi, Ching Zhou, Naigang Wang, Ankur Agrawal, Michael J. Klaiber, Matthew M. Ziegler, Fanchieh Yee, Shih-Hsien Lo, Sunil Shukla, George D. Gristede, Bruce M. Fleischer, Michael R. Scheuermann, Chia-Yu Chen, Michael A. Guillorn, Kailash Gopalakrishnan, Joel Abraham Silberman, Jinwook Oh, Howard M. Haynie, Thomas W. Fox, Vijayalakshmi Srinivasan, Brian W. Curran, Gary W. Maier, Swagath Venkataramani, Nianzheng Cao, Pong-Fei Lu, Christos Vezyrtzis, Tina Babinsky, Silvia Melitta Mueller, Pierce Chuang, Leland Chang, Dongsoo Lee
Published in:
ISLPED
The combination of growth in compute capabilities and availability of large datasets has led to a re-birth of deep learning. Deep Neural Networks (DNNs) have become state-of-the-art in a variety of machine learning tasks spanning domains across visio…
Author:
Shih-Hsien Lo, Brian W. Curran, Jinwook Oh, Howard M. Haynie, Vijayalakshmi Srinivasan, Leland Chang, Fanchieh Yee, Tina Babinsky, Joel Abraham Silberman, George D. Gristede, Matthew M. Ziegler, Gary W. Maier, Bruce M. Fleischer, Michael R. Scheuermann, Nianzheng Cao, Ankur Agrawal, Ching Zhou, Chia-Yu Chen, Silvia Melitta Mueller, Jungwook Choi, Naigang Wang, Kailash Gopalakrishnan, Thomas W. Fox, Sunil Shukla, Swagath Venkataramani, Michael J. Klaiber, Christos Vezyrtzis, Pierce Chuang, Dongsoo Lee, Michael A. Guillorn, Pong-Fei Lu
Published in:
VLSI Circuits
A multi-TOPS AI core is presented for acceleration of deep learning training and inference in systems from edge devices to data centers. With a programmable architecture and custom ISA, this engine achieves >90% sustained utilization across the range…
Published in:
IEEE Transactions on Components, Packaging and Manufacturing Technology. 4:123-133
3-D integration using through-silicon-vias (TSVs) is emerging as one of the key technology options for continued miniaturization. However, because of increased device and current density, the reliability of the 3-D power grid and its integrity must b…
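To illustrate why power-grid integrity matters in a TSV-based stack, here is a toy IR-drop estimate for current delivered up through a few tiers; the resistances, TSV counts, and per-tier currents are assumed placeholder values, not data from the article.

```python
# Toy IR-drop model for a 3-D stack fed through power TSVs. Current for upper
# tiers flows through the TSVs of the tiers below, so drops accumulate up the
# stack. All numbers are assumed placeholders.
tsv_resistance_ohm = 0.05          # assumed resistance of a single power TSV
tsvs_per_tier = 2000               # assumed parallel power TSVs per tier
tier_current_a = [12.0, 9.0, 6.0]  # assumed per-tier current, bottom tier first

voltage_drop = 0.0
remaining = sum(tier_current_a)    # current still to be delivered above this point
for tier, current in enumerate(tier_current_a):
    r_parallel = tsv_resistance_ohm / tsvs_per_tier
    voltage_drop += remaining * r_parallel
    remaining -= current
    print(f"tier {tier}: cumulative IR drop {voltage_drop * 1000:.2f} mV")
```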
Author:
Christian Bergeron, Michael R. Scheuermann, R. P. Robertazzi, M. Wordeman, S. Tian, Christy S. Tyberg, Joel Abraham Silberman, H. Jacobson, Phillip J. Restle
Published in:
3DIC
3D chip stacking technology has the potential to enable increased system performance through integration of heterogeneous system components, such as accelerators and high density memory, as well as through increased area for tightly integrated proces…