Popis: |
Transient and permanent errors in memory and CPUs occur with alarming frequency. Although most of these errors are masked at the hardware level or result in crashes, a non-negligible number of them leads to Silent Data Corruptions (SDCs), i.e., incorrect results of computations. Safety-critical programs require a very high level of confidence that such faults are detected and not propagated to the outside. Unfortunately, state-of-the-art fault detection techniques generally assume a limited Single Event Upset fault model, concentrating only on transient faults.We present a#x03B4;-encoding: a software-only approach to detect hardware faults with very high probability. a#x03B4;-encoding makes no assumptions on the rate and type of faults. Our approach combines AN codes and duplicated instructions to harden programs against transient and permanent hardware errors. Our evaluation shows that a#x03B4;-encoding detects 99.997% of all injected errors with performance slowdown of 2 -- 4 times. |