Popis: |
Deep Neural Networks (DNNs) have transformed the field of artificial intelligence and represent the state of the art in many machine learning tasks. There is considerable interest in using DNNs to realize edge intelligence in highly resource-constrained devices such as wearables and IoT sensors. Unfortunately, the high computational requirements of DNNs pose a serious challenge to their deployment in these systems. Moreover, due to tight cost (and hence, area) constraints, these devices are often unable to accommodate hardware accelerators, requiring DNNs to execute on the General Purpose Processor (GPP) cores that they contain. We address this challenge through lightweight micro-architectural extensions to the memory hierarchy of GPPs that exploit a key attribute of DNNs, viz. sparsity, or the prevalence of zero values. We propose SparseCache, an enhanced cache architecture that uses a null cache based on a Ternary Content Addressable Memory (TCAM) to compactly store zero-valued cache lines, while storing non-zero lines in a conventional data cache. By storing addresses rather than values for zero-valued cache lines, SparseCache increases the effective cache capacity, thereby reducing the overall miss rate and execution time. SparseCache uses a Zero Detector and Approximator (ZDA) and an Address Merger (AM) to perform reads and writes to the null cache. We evaluate SparseCache on four state-of-the-art DNNs programmed with the Caffe framework. SparseCache achieves a 5-28% reduction in miss rate, which translates to a 5-21% reduction in execution time, with only 0.1% area and 3.8% power overhead compared to a low-end Intel Atom Z-series processor.
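
As a rough behavioral sketch (an illustration under assumptions, not the paper's actual design), the read path can be modeled as two lookups: a TCAM-like null cache that records only the addresses of all-zero lines, and a conventional data cache that holds non-zero lines. All names and parameters below (SparseCache, NullCache, LINE_SIZE, null_capacity, the eviction policy) are hypothetical.

    # Behavioral model of a SparseCache-style read path (hypothetical sketch,
    # not the paper's micro-architecture). The null cache stores only the
    # addresses of all-zero lines; the data cache stores everything else.

    LINE_SIZE = 64  # bytes per cache line (assumed)

    class NullCache:
        """Models a TCAM holding addresses of lines whose data is entirely zero."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = set()

        def lookup(self, line_addr):
            return line_addr in self.entries

        def insert(self, line_addr):
            if len(self.entries) >= self.capacity:
                self.entries.pop()          # arbitrary eviction, for the sketch only
            self.entries.add(line_addr)

        def invalidate(self, line_addr):
            self.entries.discard(line_addr)

    class SparseCache:
        def __init__(self, data_cache, null_capacity=128):
            self.data_cache = data_cache    # conventional cache, modeled as a dict
            self.null_cache = NullCache(null_capacity)

        def read_line(self, line_addr, fetch_from_memory):
            # 1) Null-cache hit: the line is known to be all zeros, so no data
            #    storage is needed; return a zero line immediately.
            if self.null_cache.lookup(line_addr):
                return bytes(LINE_SIZE)
            # 2) Data-cache hit: the ordinary path.
            if line_addr in self.data_cache:
                return self.data_cache[line_addr]
            # 3) Miss: fetch the line; a zero check (stand-in for the ZDA)
            #    decides whether it lands in the null cache or the data cache.
            line = fetch_from_memory(line_addr)
            if all(b == 0 for b in line):
                self.null_cache.insert(line_addr)
            else:
                self.data_cache[line_addr] = line
            return line

    # Example usage (hypothetical): a backing store that is mostly zeros.
    memory = {0x1000: bytes(LINE_SIZE), 0x1040: bytes([1] * LINE_SIZE)}
    cache = SparseCache(data_cache={}, null_capacity=128)
    fetch = lambda addr: memory.get(addr, bytes(LINE_SIZE))
    assert cache.read_line(0x1000, fetch) == bytes(LINE_SIZE)        # tracked in null cache
    assert cache.read_line(0x1040, fetch) == bytes([1] * LINE_SIZE)  # stored in data cache

Under this model, an all-zero line occupies only an address entry in the TCAM rather than a full line in the data cache, so the same data-cache capacity holds more non-zero lines; this is the intuition behind the reported miss-rate and execution-time reductions.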