Generating Efficient FPGA-based CNN Accelerators from High-Level Descriptions

Author: Nermine Ali, Jean-Marc Philippe, Benoit Tain, Philippe Coussy
Contributors: Laboratoire Intelligence Artificielle Embarquée (LIAE), Université Paris-Saclay, Département Systèmes et Circuits Intégrés Numériques (DSCIN), Laboratoire d'Intégration des Systèmes et des Technologies (LIST, CEA), Direction de Recherche Technologique (DRT, CEA), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Laboratoire Environnement de Conception & Architecture (LECA), Université de Bretagne Sud - Lorient (UBS Lorient), Université de Bretagne Sud (UBS)
Language: English
Year of publication: 2022
Subject:
Source: Journal of Signal Processing Systems (JSPS), 2022, ⟨10.1007/s11265-022-01797-w⟩
ISSN: 1939-8018 (print), 1939-8115 (electronic)
DOI: 10.1007/s11265-022-01797-w
Description: International audience; The wide landscape of memory-hungry and compute-intensive Convolutional Neural Networks (CNNs) is quickly changing. CNNs continuously evolve, introducing new layers or optimization strategies to improve accuracy, reduce memory and computational needs, or both. Moving such algorithms on-device enables smarter edge products. However, hardware designers find this constant evolution hard to keep up with, which leaves CNN accelerators one step behind. A growing number of approaches therefore use reconfigurable hardware, such as FPGAs, to design customized inference accelerators better suited to newly emerging CNN algorithms. Moreover, high-level design techniques, such as High-Level Synthesis (HLS), are adopted to address time-consuming RTL-based design and design space exploration. HLS allows generating RTL source code from high-level descriptions. This paper presents a hardware accelerator generation framework targeting FPGAs that relies on two steps. The first step characterizes the input CNN and produces hardware-aware metrics. The second step exploits these metrics to produce optimized C-HLS source code for each layer of the input CNN, then uses an HLS tool to generate a synthesizable RTL representation of the inference accelerator. The main goal of this approach is to narrow the gap between evolving CNNs and their hardware accelerators, thus reducing the design time of new systems. (An illustrative sketch of such per-layer C-HLS code is given after this record.)
Database: OpenAIRE
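
The abstract describes a two-step flow: first characterize the CNN to obtain hardware-aware metrics, then emit per-layer C-HLS code that an HLS tool synthesizes into RTL. As a rough, hand-written illustration of what such per-layer C-HLS code could look like, the sketch below implements a single stride-1, unpadded convolution layer with a fused ReLU. All names, dimensions, and pragma choices (conv_layer, IN_CH, OUT_CH, the Vitis-HLS-style PIPELINE pragma, and so on) are assumptions made for this example only; they are not taken from the paper or its generated output.

/* Minimal HLS-style C sketch of one convolution layer (illustrative only;
 * not the code actually emitted by the framework described above). */

#define IN_CH   16                  /* input feature maps (assumed)  */
#define OUT_CH  32                  /* output feature maps (assumed) */
#define IN_H    28                  /* input height (assumed)        */
#define IN_W    28                  /* input width (assumed)         */
#define K        3                  /* kernel size (assumed)         */
#define OUT_H   (IN_H - K + 1)      /* no padding, stride 1          */
#define OUT_W   (IN_W - K + 1)

void conv_layer(const float in[IN_CH][IN_H][IN_W],
                const float weights[OUT_CH][IN_CH][K][K],
                const float bias[OUT_CH],
                float out[OUT_CH][OUT_H][OUT_W])
{
    /* Iterate over every output pixel of every output channel. */
    for (int oc = 0; oc < OUT_CH; oc++) {
        for (int oy = 0; oy < OUT_H; oy++) {
            for (int ox = 0; ox < OUT_W; ox++) {
#pragma HLS PIPELINE II=1           /* Vitis-HLS-style pragma, assumed here */
                float acc = bias[oc];
                /* Multiply-accumulate over the receptive field. */
                for (int ic = 0; ic < IN_CH; ic++) {
                    for (int ky = 0; ky < K; ky++) {
                        for (int kx = 0; kx < K; kx++) {
                            acc += weights[oc][ic][ky][kx]
                                 * in[ic][oy + ky][ox + kx];
                        }
                    }
                }
                /* ReLU activation fused into the layer. */
                out[oc][oy][ox] = (acc > 0.0f) ? acc : 0.0f;
            }
        }
    }
}

In the framework described in the abstract, per-layer choices such as loop ordering, pipelining, unrolling factors, and on-chip buffering would presumably be tuned from the first step's hardware-aware metrics before the HLS tool produces the RTL; the fixed values above are placeholders for illustration.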