A Real-Time Architecture for Pruning the Effectual Computations in Deep Neural Networks

Authors: Hyuk-Jae Lee, Lakshminarayanan Gopalakrishnan, Mohammadreza Asadikouhanjani, Hao Zhang, Seok-Bum Ko
Year: 2021
Source: IEEE Transactions on Circuits and Systems I: Regular Papers. 68:2030-2041
ISSN: 1558-0806, 1549-8328
DOI: 10.1109/tcsi.2021.3060945
Abstract: Integrating Deep Neural Networks (DNNs) into Internet of Things (IoT) devices could enable complex sensing and recognition tasks that support a new era of human interaction with surrounding environments. However, DNNs are power-hungry, performing billions of computations per inference. Spatial DNN accelerators, in principle, can support computation-pruning techniques more readily than other common architectures such as systolic arrays. Energy-efficient DNN accelerators exploit bit-wise or word-wise sparsity in the input feature maps (ifmaps) and filter weights to skip ineffectual computations. However, there is still room for pruning the effectual computations without reducing the accuracy of DNNs. In this paper, we propose a novel real-time architecture and dataflow that decomposes multiplications down to the bit level and prunes identical computations in spatial designs while running benchmark networks. The proposed architecture prunes identical computations by identifying identical bit values present in both the ifmaps and the filter weights, without changing the accuracy of the benchmark networks. Compared to the reference design, the proposed design achieves an average per-layer speedup of 1.4$\times$ and a 1.21$\times$ improvement in energy efficiency per inference while maintaining the accuracy of the benchmark networks.
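The core idea of the abstract, decomposing a multiplication into bit-level partial products and pruning computations that are identical, can be illustrated with a minimal sketch. This is a hypothetical software analogy, not the paper's hardware dataflow: when many activations are multiplied by the same weight, the shifted partial product `weight << i` is identical for every activation whose i-th bit is set, so it can be computed once and reused (pruned) thereafter. All names below (`prune_identical_bit_products`, `partials`) are illustrative.

```python
def prune_identical_bit_products(activations, weight):
    """Bit-serial multiplication of several activations by one weight.

    Illustrative sketch only: each effectual partial product
    (weight << i, for a set activation bit i) is computed once and
    cached; later occurrences of the same bit position reuse the
    cached value, mimicking the pruning of identical computations.
    """
    partials = {}            # bit position -> weight << i, computed once
    computed = reused = 0    # counters for first-time vs. pruned terms
    results = []
    for a in activations:
        acc = 0
        i = 0
        while a >> i:                      # iterate over bits of a
            if (a >> i) & 1:               # effectual bit (non-zero)
                if i not in partials:
                    partials[i] = weight << i
                    computed += 1
                else:
                    reused += 1            # identical computation pruned
                acc += partials[i]
            i += 1
        results.append(acc)
    return results, computed, reused

# Multiplying 3, 5, 7 by the weight 6: 7 effectual bit products in
# total, but only 3 distinct shifts ever need to be computed.
outs, computed, reused = prune_identical_bit_products([3, 5, 7], 6)
# outs == [18, 30, 42], computed == 3, reused == 4
```

The ratio of reused to total effectual terms is a rough software proxy for the savings the architecture targets; the paper additionally skips ineffectual (zero-bit) products, which the sketch never generates in the first place.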
Database: OpenAIRE