Showing 1 - 10 of 21 for search: '"Gwangsun Kim"'
Author:
Hyungkyu Ham, Hyunuk Cho, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim
Published in:
IEEE Access, Vol 12, Pp 142651-142667 (2024)
Currently, GPUs face significant challenges due to limited off-chip bandwidth (BW) and memory capacity during DNN training. To address these bottlenecks, we propose a memory access-triggered near-data processing (matNDP) architecture that offloads memo
External link:
https://doaj.org/article/5e0733ab887241b48181918dc53d2a04
Published in:
IEEE Computer Architecture Letters. 21:61-64
Author:
Eunhyeok Park, Jeongmin Hong, Euicheol Lim, Hyunuk Cho, Minjae Kim, Gwangsun Kim, Jueon Park, Hyungkyu Ham, Hyojin Sung
Published in:
IEEE Computer Architecture Letters. 20:171-174
Published in:
Proceedings of the 49th Annual International Symposium on Computer Architecture.
Author:
John Kim, Gwangsun Kim, Jung Ho Ahn, Jongwook Chung, Wonjun Song, Hyung-Joon Jung, Jae W. Lee
Published in:
ASPLOS
NUMA (non-uniform memory access) servers are commonly used in high-performance computing and datacenters. Within each server, a processor-interconnect (e.g., Intel QPI, AMD HyperTransport) is used to communicate between the different sockets or nodes
Author:
Onur Mutlu, Niladrish Chatterjee, Stephen W. Keckler, Gwangsun Kim, Mike O'Connor, Eiman Ebrahimi, Kevin Hsieh, Nandita Vijaykumar
Published in:
ISCA
Main memory bandwidth is a critical bottleneck for modern GPU systems due to limited off-chip pin bandwidth. 3D-stacked memory architectures provide a promising opportunity to significantly alleviate this bottleneck by directly connecting a logic lay
Published in:
IEEE Transactions on Computers. 65:480-494
A cost-efficient network-on-chip is needed in scalable many-core systems. Recent multicore processors have leveraged a ring topology, and a hierarchical ring can increase scalability but presents different challenges, including higher hop count and gl
Published in:
ISCA
High-radix topologies in large-scale networks provide low network diameter and high path diversity, but the idle power from high-speed links results in energy inefficiency, especially at low traffic load. In this work, we exploit the high path divers
Published in:
SC
3D-stacked memory devices with processing logic can help alleviate the memory bandwidth bottleneck in GPUs. However, in order for such Near-Data Processing (NDP) memory stacks to be used for different GPU architectures, it is desirable to standardize
Published in:
IEEE Transactions on Computers. 63:1487-1500
Many-core processors will have many processing cores with a network-on-chip (NoC) that provides access to shared resources such as main memory and on-chip caches. However, locally-fair arbitration in a multi-stage NoC can lead to globally unfair access