Embedded Streaming Deep Neural Networks Accelerator With Applications

Author: Eugenio Culurciello, Berin Martini, Jonghoon Jin, Aysegul Dundar
Publication year: 2017
Subject:
Source: IEEE Transactions on Neural Networks and Learning Systems. 28:1572-1583
ISSN: 2162-2388, 2162-237X
DOI: 10.1109/tnnls.2016.2545298
Description: Deep convolutional neural networks (DCNNs) have become a very powerful tool in visual perception. DCNNs have applications in autonomous robots, security systems, mobile phones, and automobiles, where high throughput of the feedforward evaluation phase and power efficiency are important. Because of this increased usage, many field-programmable gate array (FPGA)-based accelerators have been proposed. In this paper, we present an optimized streaming method for a DCNN hardware accelerator on an embedded platform. The streaming method acts as a compiler, transforming a high-level representation of DCNNs into operation codes that execute applications on the hardware accelerator. The proposed method makes maximal use of the available computational resources through a novel scheduled routing topology that combines data reuse and data concatenation. It is tested with a hardware accelerator implemented on the Xilinx Kintex-7 XC7K325T FPGA. The system fully explores weight-level and node-level parallelization of DCNNs and achieves a peak performance of 247 G-ops while consuming less than 4 W of power. We test our system with object classification and object detection applications in real-world scenarios. Our results indicate high performance efficiency, outperforming all other presented platforms while running these applications.
Database: OpenAIRE