Author: |
Young, Jeffrey S., Hein, Eric, Eswar, Srinivas, Lavin, Patrick, Li, Jiajia, Riedy, Jason, Vuduc, Richard, Conte, Thomas M. |
Year of publication: |
2018 |
Subject: |
|
Source: |
Parallel Computing, 2019, ISSN 0167-8191 |
Document type: |
Working Paper |
DOI: |
10.1016/j.parco.2019.04.012 |
Description: |
The Emu Chick is a prototype system designed around the concept of migratory memory-side processing. Rather than transferring large amounts of data across power-hungry, high-latency interconnects, the Emu Chick moves lightweight thread contexts to near-memory cores before the beginning of each memory read. The current prototype hardware uses FPGAs to implement cache-less "Gossamer" cores for doing computational work and a stationary core to run basic operating system functions and migrate threads between nodes. In this multi-node characterization of the Emu Chick, we extend an earlier single-node investigation (Hein et al., AsHES 2018) of the memory bandwidth characteristics of the system through benchmarks like STREAM, pointer chasing, and sparse matrix-vector multiplication (SpMV). We compare the Emu Chick hardware to architectural simulation and an Intel Xeon-based platform. Our results demonstrate that for many basic operations the Emu Chick can use available memory bandwidth more efficiently than a more traditional, cache-based architecture, although bandwidth usage suffers for computationally intensive workloads like SpMV. Moreover, the Emu Chick provides stable, predictable performance with up to 65% of peak bandwidth utilization on a random-access pointer chasing benchmark with weak locality. |
Database: |
arXiv |
External link: |
|