Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC
Authors: | Daniel Gutierrez-Galan, Antonio Rios-Navarro, Juan Pedro Dominguez-Morales, Lourdes Duran-Lopez, Manuel Domínguez-Morales, Ricardo Tapiador-Morales, Enrique Piñero-Fuentes |
---|---|
Contributors: | Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores; Universidad de Sevilla. TEP108: Robótica y Tecnología de Computadores |
Publication year: | 2021 |
Subject: | Computer Networks and Communications; Computer science; Embedded systems; Memory organization; PSoC; Gate array; Field-programmable gate array; FPGA; Hardware accelerator; Hardware acceleration; Deep learning; Artificial intelligence; Frame (networking); Electrical and Electronic Engineering; Hardware and Architecture; Control and Systems Engineering; Signal Processing; lcsh:TK7800-8360; lcsh:Electronics; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020208 electrical & electronic engineering; 020201 artificial intelligence & image processing |
Source: | Electronics, Vol. 10, Issue 1, p. 94 (2021); idUS: Depósito de Investigación de la Universidad de Sevilla, Universidad de Sevilla (US) |
Description: | The use of deep learning solutions is increasing across disciplines, and their algorithms are computationally expensive in most cases. For this reason, numerous hardware accelerators have appeared that compute these operations efficiently in parallel, achieving higher performance and lower latency. These algorithms need large amounts of data to feed each of their computing layers, so the data transfers that deliver information to the accelerators and collect results from them must be handled efficiently. Such accelerators are widely implemented on hybrid devices that combine an embedded computer, on which an operating system can run, with a field-programmable gate array (FPGA), on which the accelerator can be deployed. In this work, we present a software API that organizes memory efficiently and avoids reallocating data from one memory area to another. It achieves an 85% speed-up over the native Linux driver and reduces the frame computing time by 28% in a real application. Funding: Spanish Agencia Estatal de Investigación (AEI) project MINDROB: “Percepción y Cognición Neuromórfica para Actuación Robótica de Alta Velocidad” (PID2019-105556GB-C33, AEI/10.13039/501100011033). |
Database: | OpenAIRE |
External link: |