Search results - "Benjamin Klenk"

A Case For Intra-rack Resource Disaggregation in HPC

Authors: George Michelogiannakis, Benjamin Klenk, Brandon Cook, Min Yee Teh, Madeleine Glick, Larry Dennison, Keren Bergman, John Shalf

Published in: ACM Transactions on Architecture and Code Optimization, vol. 19, iss. 2

The expected halt of traditional technology scaling is motivating increased heterogeneity in high-performance computing (HPC) systems with the emergence of numerous specialized accelerators. As heterogeneity increases, so does the risk of underutiliz

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c0653e28787b2d03eb6080694bf9a148
https://escholarship.org/uc/item/73x617x8

SiP-ML

Authors: Benjamin Klenk, Madeleine Glick, Eiman Ebrahimi, Mehrdad Khani, Manya Ghobadi, Ziyi Zhu, Mohammad Alizadeh, Keren Bergman, Amin Vahdat

Published in: SIGCOMM

This paper proposes optical network interconnects as a key enabler for building high-bandwidth ML training clusters with strong scaling properties. Our design, called SiP-ML, accelerates the training time of popular DNN models using silicon photonics

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::bde66cf5e93c80306d4a9ce2f110011a
https://doi.org/10.1145/3452296.3472900

An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives

Authors: Greg Thorson, Benjamin Klenk, Larry R. Dennison, Nan Jiang

Published in: ISCA

The slowdown of single-chip performance scaling combined with the growing demands of computing ever larger problems efficiently has led to a renewed interest in distributed architectures and specialized hardware. Dedicated accelerators for common or

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::6ac2f5d177139201afa8749e1aedaafb
https://doi.org/10.1109/isca45697.2020.00085

Why Data Science and Machine Learning Need Silicon Photonics

Authors: Benjamin Klenk, Larry R. Dennison

Published in: OFC

Training deep neural networks demands vast amounts of computation, provided by large distributed systems. The increasing demand for bandwidth will exceed the limits of electrical and non-integrated optical signaling and will require integrated optics

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::a5b0686e80d0d91dbbd17678b0ee49df
https://doi.org/10.1364/ofc.2020.m4f.6

Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors

Authors: Larry R. Dennison, Benjamin Klenk, Hans Eberle, Holger Froening

Published in: IPDPS

Accelerators, such as GPUs, have proven to be highly successful in reducing execution time and power consumption of compute-intensive applications. Even though they are already used pervasively, they are typically supervised by general-purpose CPUs,

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::4fb607d891bb2cf65771c50fbc0c57a0
https://doi.org/10.1109/ipdps.2017.94

An Overview of MPI Characteristics of Exascale Proxy Applications

Authors: Holger Fröning, Benjamin Klenk

Published in: Lecture Notes in Computer Science, ISBN: 9783319586663
ISC

The scale of applications and computing systems is tremendously increasing and needs to increase even more to realize exascale systems. As the number of nodes keeps growing, communication has become key to high performance.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::245ec13492ede026d3661c6ad5bcb246
https://doi.org/10.1007/978-3-319-58667-0_12

Analyzing GPU-controlled communication with dynamic parallelism in terms of performance and energy

Authors: Benjamin Klenk, Holger Fröning, Lena Oden

Intra-GPU synchronization is a problem for GPU-controlled communication. Options based on dynamic parallelism provide on-device synchronization. GPU-controlled communication has lower performance than CPU-assisted approaches. Relieving the CPU from

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fe878a034c07a6763fc42353aa487402
https://publica.fraunhofer.de/handle/publica/245304

Analyzing communication models for distributed thread-collaborative processors in terms of energy and time

Authors: Holger Fröning, Benjamin Klenk, Lena Oden

Published in: ISPASS

Accelerated computing has become pervasive for increasing the computational power and energy efficiency in terms of GFLOPs/Watt. For application areas with highest demands, for instance high performance computing, data warehousing and high performanc

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::6f6589eeb9fb00728f6c7c4c3118b189
https://doi.org/10.1109/ispass.2015.7095817

Energy-Efficient Stencil Computations on Distributed GPUs Using Dynamic Parallelism and GPU-Controlled Communication

Authors: Benjamin Klenk, Holger Fröning, Lena Oden

Published in: E2SC@SC

GPUs are widely used in high-performance computing due to their high computational power and high performance per watt. Still, one of the main bottlenecks of GPU-accelerated cluster computing is the data transfer between distributed GPUs. This not o

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::a46ee3b412df9b3face42f59950dc347
https://doi.org/10.1109/e2sc.2014.14

Energy-Efficient Collective Reduce and Allreduce Operations on Distributed GPUs

Authors: Lena Oden, Benjamin Klenk, Holger Fröning

Published in: CCGRID

GPUs gain high popularity in High Performance Computing, due to their massive parallelism and high performance per Watt. Despite their popularity, data transfer between multiple GPUs in a cluster remains a problem. Most communication models require t

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::85c5ca24808254c38b5cf1bef89a8364
https://doi.org/10.1109/ccgrid.2014.21
