LRMP: Layer Replication with Mixed Precision for spatial in-memory DNN accelerators

Author:	Abinand Nallathambi, Christin David Bose, Wilfried Haensch, Anand Raghunathan
Language:	English
Year of publication:	2024
Subject:	in-memory computing; analog accelerator; quantization; reinforcement learning; mixed integer linear programming; Electronic computers. Computer science; QA75.5-76.95
Source:	Frontiers in Artificial Intelligence, Vol 7 (2024)
Document type:	article
ISSN:	2624-8212
DOI:	10.3389/frai.2024.1268317
Description:	In-memory computing (IMC) with non-volatile memories (NVMs) has emerged as a promising approach to address the rapidly growing computational demands of Deep Neural Networks (DNNs). Mapping DNN layers spatially onto NVM-based IMC accelerators achieves high degrees of parallelism. However, two challenges that arise in this approach are the highly non-uniform distribution of layer processing times and high area requirements. We propose LRMP, a method to jointly apply layer replication and mixed precision quantization to improve the performance of DNNs when mapped to area-constrained IMC accelerators. LRMP uses a combination of reinforcement learning and mixed integer linear programming to search the replication-quantization design space using a model that is closely informed by the target hardware architecture. Across five DNN benchmarks, LRMP achieves 2.6–9.3× latency and 8–18× throughput improvement at minimal (
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/c7e96859d51e449fb0a70648eeac127d