Improving the Robustness of Redundant Execution with Register File Randomization

Authors: Tuzov, Ilya; Andreu, Pablo; Medina, Laura; Picornell-Sanjuan, Tomás; Robles Martínez, Antonio; López Rodríguez, Pedro Juan; Flich Cardo, José; Hernández Luz, Carles
Language: English
Subject:
Source: 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)
DOI: 10.1109/iccad51958.2021.9643466
Description: [EN] Staggered Redundant Execution (SRE) is a fault-tolerance mechanism that has been widely deployed in safety-critical applications. SRE not only protects the system in the presence of faults but also helps relax the safety requirements of individual elements. However, in this paper we show that SRE does not effectively protect the system against a wide range of faults, and thus new mechanisms to increase the diversity of homogeneous cores are needed. We propose Register File Randomization (RFR), a low-cost diversity mechanism that significantly increases the robustness of homogeneous multicores against common-cause faults (CCFs) and register file wearout. Our results show that RFR eliminates failures caused by register file CCFs for certain workloads and reduces the impact of stress-related register file aging by a factor of 5 for the workloads analysed. Our implementation requires fewer than 50 lines of RTL code, and the area (FPGA logic) overhead of RFR is less than 0.2% of a 64-bit RISC-V core FPGA implementation.
This work has received funding from the ECSEL Joint Undertaking (JU) under grant agreement no. 877056, from the Agencia Estatal de Investigación of Spain under grant agreement no. PCI2020-112092, and from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 871467.
Database: OpenAIRE