A low-level software-based fault tolerance approach to detect SEUs in GPUs' register files

Authors: Márcio A D Gonçalves, Mateus Saquetti, Jose Rodrigo Azambuja, Fernanda Lima Kastensmidt
Year of publication: 2017
Subject:
Source: Microelectronics Reliability. :665-669
ISSN: 0026-2714
Description: This paper presents low-level software-based fault tolerance techniques for detecting single event upsets (SEUs) in the register files of graphics processing units (GPUs). SEUs have a major influence on such architectures, especially on register files and cache memory. To harden the register files, software-based techniques are presented and tuned to detect faults in vector, address, and predicate registers. A fault injection campaign at register transfer level (RTL) is performed on the register files using a G80 general-purpose GPU running four case-study applications. Results show error reductions of up to 100% and execution-time overheads of up to 1.78 times the original values.
Database: OpenAIRE