A low-level software-based fault tolerance approach to detect SEUs in GPUs' register files
Authors: | Márcio A D Gonçalves, Mateus Saquetti, Jose Rodrigo Azambuja, Fernanda Lima Kastensmidt |
---|---|
Year of publication: | 2017 |
Subject: | 0209 industrial biotechnology; 010308 nuclear & particles physics; business.industry; Computer science; CPU cache; Processor register; Graphics processing unit; Fault tolerance; Hardware_PERFORMANCEANDRELIABILITY; 02 engineering and technology; Fault injection; Condensed Matter Physics; 01 natural sciences; Atomic and Molecular Physics and Optics; Stack register; Surfaces, Coatings and Films; Electronic, Optical and Magnetic Materials; 020901 industrial engineering & automation; Software; Embedded system; Software fault tolerance; 0103 physical sciences; Electrical and Electronic Engineering; Safety, Risk, Reliability and Quality; business |
Source: | Microelectronics Reliability. :665-669 |
ISSN: | 0026-2714 |
Description: | This paper presents software-based fault tolerance techniques, applied at a low abstraction level, to detect single event upsets (SEUs) in the register files of Graphics Processing Units. SEUs have a major influence on such architectures, especially affecting register files and cache memory. To harden the register files, the techniques are tuned to detect faults in vector, address, and predicate registers. A fault injection campaign at Register Transfer Level is performed on the register files of a G80 general-purpose graphics processing unit running four case-study applications. Results show error reductions of up to 100%, with execution time overheads of up to 1.78 times the original values. |
Database: | OpenAIRE |
External link: |
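
The description refers to software-based detection of SEUs in register files. The record does not spell out the paper's actual techniques, so the following is only a minimal sketch of one common software-based idea, duplication with comparison, modeled in Python rather than at GPU instruction level. The names `flip_bit` and `hardened_add`, the 32-bit register width, and the choice of an addition as the protected operation are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width (illustrative only)


def flip_bit(value, bit):
    """Model an SEU: invert a single bit of a 32-bit register value."""
    return (value ^ (1 << bit)) & MASK32


def hardened_add(a, b, inject=None):
    """Compute a + b into two separate 'registers' and compare the copies.

    inject: optional bit index; when given, that bit of the primary copy
    is flipped after the computation to emulate an upset in the register
    file. Returns (primary_result, fault_detected).
    """
    primary = (a + b) & MASK32
    shadow = (a + b) & MASK32   # redundant copy held in a second register
    if inject is not None:
        primary = flip_bit(primary, inject)
    # A mismatch between the two copies signals a detected fault.
    return primary, primary != shadow
```

Calling `hardened_add(3, 4)` reports no fault, while `hardened_add(3, 4, inject=5)` returns a corrupted value and flags the mismatch. The redundant computation and the comparison are also where the execution time overhead of such techniques comes from.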