FPGA Design of High-Speed Convolutional Neural Network Hardware Accelerator

Author:	Abdallah S. Mohamed, Ziad Ibrahim, Ahmed J. Abd El-Maksoud, Amr Eid, Fatma Khaled, Amr Adel, Farida Khaled, Eman El Mandouh, Ahmed Tarek, Hassan Mostafa
Year of publication:	2021
Subject:	Speedup; Parallel processing (DSP implementation); Computer science; Process (computing); Hardware acceleration; Field-programmable gate array; Convolutional neural network; Bottleneck; Computer hardware; Convolution
Source:	2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES).
DOI:	10.1109/niles53778.2021.9600555
Description:	Convolutional Neural Networks (CNNs) are becoming increasingly important because they enable machines to interact with the surrounding environment, which paves the way for computer vision applications. FPGA implementations of CNN architectures have higher speed and lower power consumption compared to GPUs and CPUs. This paper proposes a high-speed hardware accelerator on FPGA for the SqueezeNet CNN that accelerates its processing without decreasing the classification accuracy. Several ideas are applied to solve the memory bottleneck, such as using Ping-Pong memory and deploying several FIFOs in the design. The architecture is built as a pipelined unit that processes the SqueezeNet CNN layer by layer. Different parallelism techniques are applied to the convolution layers to speed up layer processing. The proposed accelerator classifies 248.76 frames per second (fps) at a frequency of 100 MHz, and 427.4 fps at a frequency of 172 MHz. It is implemented on a Virtex-7 FPGA and outperforms the GeForce RTX 2080 Ti GPU and several SqueezeNet FPGA implementations.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::a6c6b606d2330e3ea245a150afba4b7f https://doi.org/10.1109/niles53778.2021.9600555
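
Note on the reported throughput: the two operating points in the description can be cross-checked by dividing the clock frequency by the frame rate, which gives the average number of clock cycles between completed classifications. The sketch below is only this arithmetic; the helper name `cycles_per_frame` is hypothetical, and it assumes the quoted fps values are steady-state throughput figures. It says nothing about the single-image latency of the pipeline, which the record does not state.

```python
def cycles_per_frame(clock_hz: float, frames_per_second: float) -> float:
    """Average clock cycles between two completed classifications."""
    return clock_hz / frames_per_second


# Operating points quoted in the description (clock in Hz, throughput in fps).
reported = [(100e6, 248.76), (172e6, 427.4)]

for clock_hz, fps in reported:
    cycles = cycles_per_frame(clock_hz, fps)
    print(f"{clock_hz / 1e6:.0f} MHz: {cycles:,.0f} cycles per frame")

# Relative difference between the two cycle counts; a small value means the
# frame rate scales almost linearly with the clock frequency.
c_low = cycles_per_frame(*reported[0])
c_high = cycles_per_frame(*reported[1])
relative_gap = abs(c_high - c_low) / c_low
```

Both operating points come out at roughly 402,000 cycles per frame, which is consistent with the frame rate scaling with the clock frequency rather than being limited by some frequency-independent factor.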