On the Resilience of Deep Learning for Reduced-voltage FPGAs

Author: Kamyar Givaki, S. M. Reza Tayaranian, Adrian Cristal, Osman Unsal, Ahmad Khonsari, Dara Rahmati, Reza Hojabr, Behzad Salami, Saeid Gorgin
Contributors: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Language: English
Subject:
FOS: Computer and information sciences
Computer Science - Machine Learning (cs.LG)
Computer Science - Neural and Evolutionary Computing (cs.NE)
Computer science
Computer engineering
Activation function
Voltage underscaling
Voltage
Hardware: Performance and Reliability
Training
Resilience
FPGA
Field-programmable gate array
Computer science::Computer architecture [UPC subject areas]
Deep learning
High performance computing -- Energy consumption
Transistor
Fault injection
Hardware accelerator
Artificial intelligence
DNN
Source: 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
PDP
DOI: 10.1109/pdp50117.2020.00023
Description: Deep Neural Networks (DNNs) are inherently computation-intensive and power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as well as CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for minimizing power dissipation. Unfortunately, bit-flip faults start to appear as the voltage is scaled down closer to the transistor threshold due to timing issues, creating a resilience problem. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage-underscaling-related faults in FPGAs, especially in on-chip memories. Toward this goal, we experimentally evaluated the resilience of LeNet-5 and of a network specially designed for the CIFAR-10 dataset, with two different activation functions: Rectified Linear Unit (ReLU) and Hyperbolic Tangent (Tanh). We found that modern FPGAs are robust enough at extremely low voltage levels and that low-voltage-related faults can be automatically masked within the training iterations, so there is no need for costly software- or hardware-oriented fault mitigation techniques such as ECC. Approximately 10% more training iterations are needed to close the accuracy gap. This observation is the result of the relatively low rate of undervolting faults, i.e.
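The description notes that undervolting manifests as bit-flip faults in on-chip memories during training. A minimal sketch of how such faults might be emulated in software for a resilience study, assuming NumPy and a uniform per-bit fault probability (the function name `inject_bit_flips` and the parameter `fault_rate` are illustrative choices, not the paper's actual methodology):

```python
import numpy as np

def inject_bit_flips(weights, fault_rate, rng):
    """Flip random bits in a float32 weight array.

    Emulates undervolting-related faults in on-chip weight memory by
    flipping each bit independently with probability `fault_rate`.
    The paper reports a relatively low fault rate, which is why the
    resulting errors can be masked over subsequent training iterations.
    """
    flat = weights.astype(np.float32).ravel().copy()
    bits = flat.view(np.uint32)  # reinterpret floats as raw 32-bit words
    n_bits = bits.size * 32
    n_flips = rng.binomial(n_bits, fault_rate)  # number of faulty bits
    for idx in rng.choice(n_bits, size=n_flips, replace=False):
        word, bit = divmod(int(idx), 32)
        bits[word] ^= np.uint32(1) << np.uint32(bit)
    return flat.reshape(weights.shape)

# Example: corrupt a small weight matrix between training iterations.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
w_faulty = inject_bit_flips(w, fault_rate=1e-3, rng=rng)
```

In an actual experiment such an injection step would be applied to the weights (or activations) each iteration, and the number of extra iterations needed to recover the fault-free accuracy would be measured.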
Database: OpenAIRE