Showing 1 - 10 of 18 for search: '"Hari Subramoni"'
Author:
Kaushik Kandadi Suresh, Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda
Published in:
IEEE Micro. 43:131-139
Author:
Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda
Published in:
Journal of Computer Science and Technology. 38:128-145
Published in:
IEEE Micro. 42:53-60
Author:
Sourav Chakraborty, Jahanzeb Maqbool Hashmi, Hari Subramoni, Mohammadreza Bayatpour, Ching-Hsiang Chu, Dhabaleswar K. Panda
Published in:
Journal of Parallel and Distributed Computing. 144:1-13
This paper addresses the challenges of MPI derived datatype processing and proposes FALCON-X, a Fast and Low-overhead Communication framework for optimized zero-copy intra-node derived datatype communication on emerging CPU/GPU architectures. …
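For context, the non-contiguous communication that MPI derived datatypes describe can be shown with a minimal generic-MPI sketch (plain MPI, not the FALCON-X framework itself): a strided matrix column packed by the library via MPI_Type_vector.

/* Minimal sketch: sending one non-contiguous matrix column with an
 * MPI derived datatype (generic MPI, not the FALCON-X framework). */
#include <mpi.h>

#define ROWS 4
#define COLS 8

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double matrix[ROWS][COLS] = {{0}};
    MPI_Datatype column;
    /* ROWS blocks of 1 double, strided COLS doubles apart: a column. */
    MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    if (rank == 0)
        MPI_Send(&matrix[0][2], 1, column, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(&matrix[0][2], 1, column, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    MPI_Type_free(&column);
    MPI_Finalize();
    return 0;
}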
Published in:
IEEE Micro. 40:35-43
Heterogeneous high-performance computing systems with GPUs are equipped with high-performance interconnects like InfiniBand, Omni-Path, PCIe, and NVLink. However, little exists in the literature that captures the performance impact of these interconnects…
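For reference, an interconnect's performance impact is typically quantified with point-to-point micro-benchmarks; a minimal MPI ping-pong sketch (illustrative only, not the benchmark suite evaluated in the paper) might look like:

/* Minimal ping-pong bandwidth sketch between ranks 0 and 1
 * (illustrative only; not the benchmark suite from the paper). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 100;
    const int bytes = 1 << 20;               /* 1 MiB messages */
    char *buf = malloc(bytes);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double elapsed = MPI_Wtime() - t0;
    if (rank == 0)  /* two transfers per iteration (there and back) */
        printf("bandwidth: %.1f MB/s\n",
               2.0 * iters * bytes / elapsed / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}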
Author:
Sourav Chakraborty, Hari Subramoni, Mohammadreza Bayatpour, Pouya Kousha, Dhabaleswar K. Panda, Amit Ruhela
Published in:
Parallel Computing. 85:13-26
The overlap of computation and communication is critical for good performance of many HPC applications. State-of-the-art designs for asynchronous progress require specially designed hardware resources (advanced switches or network interface cards)…
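For background, the overlap in question is the classic nonblocking-MPI pattern sketched below (generic MPI; whether the transfer actually progresses during the compute loop is exactly the asynchronous-progress problem the paper studies):

/* Sketch of computation/communication overlap with nonblocking MPI.
 * Real overlap depends on the library's asynchronous progress. */
#include <mpi.h>

#define N 1024

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int peer = rank ^ 1;  /* pair up ranks: 0<->1, 2<->3, ... */

    static double sendbuf[N], recvbuf[N];
    MPI_Request reqs[2];
    MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

    /* Independent computation placed in the overlap window. */
    double acc = 0.0;
    for (long i = 0; i < 10000000; i++) acc += (double)i * 1e-9;

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    MPI_Finalize();
    return acc < 0.0;  /* keep the compute result observable */
}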
Author:
Ammar Ahmad Awan, Karthik Vadambacheri Manian, Dhabaleswar K. Panda, Hari Subramoni, Ching-Hsiang Chu
Published in:
Parallel Computing. 85:141-152
Traditionally, MPI runtimes have been designed for clusters with a large number of nodes. However, with the advent of MPI+CUDA applications and GPU clusters with a relatively smaller number of nodes, efficient communication schemes need to be designed…
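A minimal sketch of the MPI+CUDA pattern in question, assuming a CUDA-aware MPI build that accepts device pointers directly (a non-CUDA-aware build would instead need a staging copy through host memory):

/* Sketch: passing a GPU buffer straight to MPI. Assumes a CUDA-aware
 * MPI library; otherwise stage the buffer through host memory first. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    float *d_buf;
    cudaMalloc((void **)&d_buf, n * sizeof(float));

    if (rank == 0)
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}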
Author:
Amit Ruhela, Dhabaleswar K. Panda, Srinivasan Ramesh, Hari Subramoni, Allen D. Malony, Aurèle Mahéo, Sameer Shende
Published in:
Parallel Computing. 77:19-37
The desire for high performance on scalable parallel systems is increasing the complexity and tunability of MPI implementations. The MPI Tools Information Interface (MPI_T), introduced as part of the MPI 3.0 standard, provides an opportunity for performance…
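For context, MPI_T is a concrete API in the MPI standard; a minimal sketch that initializes the tools interface and counts the control variables the library exposes:

/* Sketch: querying the MPI Tools Information Interface (MPI_T) for
 * the number of control variables the MPI library exposes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, num_cvars;
    /* MPI_T may be initialized before (and independently of) MPI. */
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_Init(&argc, &argv);

    MPI_T_cvar_get_num(&num_cvars);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("MPI_T exposes %d control variables\n", num_cvars);

    MPI_Finalize();
    MPI_T_finalize();
    return 0;
}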
Author:
Hari Subramoni, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda, Ching-Hsiang Chu, Chen-Chun Chen, Kawthar Shafie Khorassani
Published in:
Lecture Notes in Computer Science ISBN: 9783030787127
ISC
Due to the emergence of AMD GPUs and their adoption in upcoming exascale systems (e.g., Frontier), it is pertinent to have scientific applications and communication middlewares ported and optimized for these systems. The Radeon Open Compute (ROCm) platform…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::7b121328f3fe8240f55047725cb698e2
https://doi.org/10.1007/978-3-030-78713-4_7
Author:
Hari Subramoni, Bharath Ramesh, Ching-Hsiang Chu, Arpan Jain, Dhabaleswar K. Panda, Nick Sarkauskas, Kaushik Kandadi Suresh, Pouya Kousha
Published in:
HiPC
The recent advent of advanced fabrics like NVIDIA NVLink is enabling the deployment of dense Graphics Processing Unit (GPU) systems, e.g., DGX-2 and Summit. The Message Passing Interface (MPI) has been the dominant programming model to design distributed…