On the Resilience of Deep Learning for Reduced-voltage FPGAs
Authors: | Kamyar Givaki, S. M. Reza Tayaranian, Adrian Cristal, Osman Unsal, Ahmad Khonsari, Dara Rahmati, Reza Hojabr, Behzad Salami, Saeid Gorgin |
---|---|
Contributors: | Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors; Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Barcelona Supercomputing Center; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions |
Language: | English |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Computer science; Activation function; Voltage underscaling; 02 engineering and technology; Hardware_PERFORMANCEANDRELIABILITY; 01 natural sciences; 010305 fluids & plasmas; law.invention; Machine Learning (cs.LG); law; 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; Training; Neural and Evolutionary Computing (cs.NE); Resilience (network); Field-programmable gate array; Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]; FPGA; 020203 distributed computing; Matrius de portes programables per l'usuari; Resilience; business.industry; Deep learning; Càlcul intensiu (Informàtica) -- Consum d'energia; Transistor; Computer Science - Neural and Evolutionary Computing; Field programmable gate arrays; Fault injection; High performance computing -- Energy consumption; Computer engineering; Hardware acceleration; Artificial intelligence; Hardware accelerator; business; Voltage; DNN |
Source: | 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP); UPCommons. Portal del coneixement obert de la UPC; Universitat Politècnica de Catalunya (UPC); PDP |
DOI: | 10.1109/pdp50117.2020.00023 |
Description: | Deep Neural Networks (DNNs) are inherently computation-intensive and power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can meet these computation and power demands for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as in CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for minimizing power dissipation. Unfortunately, as the voltage is scaled down toward the transistor threshold, timing issues cause bit-flip faults to appear, creating a resilience issue. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage-underscaling-related faults in FPGAs, especially in on-chip memories. To this end, we evaluate the resilience of LeNet-5 and of a network specially designed for the CIFAR-10 dataset, using two activation functions: Rectified Linear Unit (ReLU) and Hyperbolic Tangent (Tanh). We find that modern FPGAs are sufficiently robust at extremely low voltage levels and that low-voltage-related faults can be automatically masked within the training iterations, so there is no need for costly software- or hardware-oriented fault mitigation techniques such as ECC. Approximately 10% more training iterations are needed to close the accuracy gap. This observation is the result of the relatively low rate of undervolting faults, i.e. |
Database: | OpenAIRE |
External link: |