G-SEAP: Analyzing and characterizing soft-error aware approximation in GPGPUs
Author: | Jingweijia Tan, Xiaohui Wei, Shang Gao, Ruyu Zhang, Hengshan Yue, Lina Li |
---|---|
Year of publication: | 2020 |
Subject: |
Correctness
Computer Networks and Communications; Dataflow; Computer science; Reliability (computer networking); 020206 networking & telecommunications; 02 engineering and technology; Soft error; Computer engineering; Hardware and Architecture; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); 020201 artificial intelligence & image processing; General-purpose computing on graphics processing units; Error detection and correction; Massively parallel; Software |
Source: | Future Generation Computer Systems. 109:262-274 |
ISSN: | 0167-739X |
DOI: | 10.1016/j.future.2020.03.040 |
Description: | As General-Purpose Graphics Processing Units (GPGPUs) become pervasive in High-Performance Computing (HPC), ensuring that programs can be protected from soft errors has become increasingly important. Soft errors may cause Silent Data Corruptions (SDCs), which silently produce erroneous execution results. Due to the massive parallelism of GPGPUs, fully protecting them against soft errors introduces nontrivial overhead. Fortunately, imprecise execution outcomes are inherently tolerable for some HPC programs due to the nature of these applications. Leveraging this feature, selective soft-error protection can be applied to reduce energy consumption. In this work, we first propose a GPGPU-based Soft-Error aware APproximation analysis framework (G-SEAP) to analyze the approximation characteristics of soft errors. Based on G-SEAP, we perform an exhaustive analysis of 17 representative HPC benchmarks and observe that, on average, 72.7% of SDCs are approximable. We also observe that the application's dataflow, kernel-function reliability requirements, instruction type, and data bit location are all important factors in program correctness. Lastly, based on these observations, we design an approximate Error Correction Code (ECC) mechanism and an approximate instruction duplication technique to illustrate how G-SEAP provides useful guidance for energy-efficient soft-error elimination in GPGPUs. |
Database: | OpenAIRE |
External link: |