Description: |
Faulty cells have become a major problem in cost-sensitive main-memory DRAM devices. Conventional solutions for reducing device failure rates caused by cells with permanent faults, such as populating spare rows and relying on error-correcting codes, have had limited success because of their high area overheads. In this paper, we propose CIDR, a novel cache-inspired DRAM resilience architecture, which substantially reduces the area overhead of handling bit errors from these faulty cells. A DRAM device adopting CIDR has a small cache next to its I/O pads that redirects accesses to addresses containing faulty cells to the cache data array instead. We minimize the energy overhead of accessing the cache tags on every read or write by adding a Bloom filter in front of the cache. The augmented cache is programmed once during the testing phase and stays off the critical path on normal accesses because the cache and the DRAM arrays are accessed in parallel, making CIDR transparent to existing processor-memory interfaces. Compared to the conventional architecture relying on spare rows, CIDR lowers the area overhead of achieving equal failure rates over a wide range of single-bit error rates; for example, it achieves a 23.6$\times$ lower area overhead for a bit-error rate of $10^{-5}$ and a device failure rate of $10^{-3}$. |
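
The lookup path described in the abstract (a Bloom filter screening accesses before the tags of a small repair cache are consulted) can be illustrated with a minimal software sketch. This is not the paper's hardware design: the class names `BloomFilter` and `RepairCache`, the filter size, the number of hash functions, the use of BLAKE2b as the hash, and the dictionary standing in for the DRAM array are all assumptions made for illustration only.

```python
import hashlib


class BloomFilter:
    """Hypothetical Bloom filter over addresses (sizes are illustrative)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, addr):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                f"{i}:{addr}".encode(), digest_size=8
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, addr):
        for pos in self._positions(addr):
            self.bits[pos] = 1

    def might_contain(self, addr):
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(addr))


class RepairCache:
    """Toy model: a small cache that replaces data at faulty addresses."""

    def __init__(self, faulty_addrs):
        self.filter = BloomFilter()
        self.data = {}          # stands in for the tag and data arrays
        self.tag_lookups = 0    # counts tag accesses the filter let through
        # Assumption: programmed once, mirroring the testing phase.
        for addr in faulty_addrs:
            self.filter.add(addr)
            self.data[addr] = 0

    def _hit(self, addr):
        if not self.filter.might_contain(addr):
            return False        # filter says absent: skip the tag lookup
        self.tag_lookups += 1
        return addr in self.data

    def read(self, addr, dram):
        # Hardware would access both arrays in parallel; here it is sequential.
        dram_value = dram.get(addr, 0)
        return self.data[addr] if self._hit(addr) else dram_value

    def write(self, addr, value, dram):
        dram[addr] = value
        if self._hit(addr):
            self.data[addr] = value
```

In this sketch, a value written to an address registered as faulty is served from the repair cache on a later read even if the DRAM copy is corrupted, while other addresses are served from the DRAM model. The `tag_lookups` counter shows how the filter keeps most accesses away from the tags.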