A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range
Author: | Thomas Benz, Frank K. Gurkaynak, Fabian Schuiki, Florian Zaruba, Luca Bertaccini, Luca Benini |
---|---|
Contributors: | Benz T., Bertaccini L., Zaruba F., Schuiki F., Gurkaynak F.K., Benini L. |
Language: | English |
Publication year: | 2021 |
Subject: | Power management, Thermal design power, Power gating, Parallel processing (DSP implementation), Computer science, Computer cluster, Granularity, Temperature distribution, Power measurement, Power system management, Instruction sets, Europe, Computer architecture, Parallel processing, Power domains, Sleep mode, Computational science |
Source: | ESSCIRC 2021-IEEE 47th European Solid State Circuits Conference (ESSCIRC) ESSCIRC |
Description: | We present Thestral, a 10-core RISC-V chip for energy-proportional parallel computing manufactured in 22 nm FD-SOI technology. Thestral contains a control core and a nine-core compute cluster. Each core features a single-precision floating-point unit (FPU) and an integer processing unit (IPU) and implements custom instruction set architecture (ISA) extensions to improve utilization. The chip features 20 fine-grain power domains: one for each FPU and IPU, as well as one for the entire acceleration cluster. Such aggressive power management granularity is valuable both for extreme-edge computing, where power gating reduces sleep power, and for high-performance computing, where leakage control is required to meet thermal design power constraints and to minimize idle power. We propose a fast and fine-grain power gating architecture with much finer granularity than the state of the art for multi-core computing platforms. A sub-10 ns power-up sequence allows for fine-tuning the compute cluster configuration, powering up only the computational units required for a specific application phase. Our solution enables up to 42% measured power savings for the extreme-edge scenario during sleep mode (@350 MHz, 0.6 V, 25 °C), which is 12.7% more than what can be achieved with aggressive clock-gating. On the other extreme, in an HPC setting, a Thestral-based many-core system running memory-bound applications (@850 MHz, 0.9 V, 75 °C) can save up to 41% power. |
Database: | OpenAIRE |
External link: |