Showing 1 - 10 of 24 for search: '"Hari Subramoni"'
Authors:
Ammar Ahmad Awan, Karthik Vadambacheri Manian, Dhabaleswar K. Panda, Hari Subramoni, Ching-Hsiang Chu
Published in:
Parallel Computing. 85:141-152
Traditionally, MPI runtimes have been designed for clusters with a large number of nodes. However, with the advent of MPI+CUDA applications and GPU clusters with a relatively smaller number of nodes, efficient communication schemes need to be designe
Published in:
MLHPC/AI4S@SC
The growth of big data applications during the last decade has led to a surge in the deployment and popularity of machine learning (ML) libraries. On the other hand, the high performance offered by GPUs makes them well suited for ML problems. To take
Authors:
D.K. Panda, Jahanzeb Maqbool Hashmi, Quentin Anthony, Asmaa M. Aljuhani, Raghu Machiraju, Arpan Jain, Anil V. Parwani, Hari Subramoni, Ammar Ahmad Awan
Published in:
SC
Data-parallelism has become an established paradigm to train DNNs that fit inside GPU memory on large-scale HPC systems. However, model-parallelism is required to train out-of-core DNNs. In this paper, we deal with emerging requirements brought forwa
Authors:
Dhabaleswar K. Panda, Jahanzeb Maqbool Hashmi, Bharath Ramesh, Mohammadreza Bayatpour, Hari Subramoni, Shulei Xu
Published in:
IPDPS
Modern multi-/many-cores offer higher core-density, hardware multi-threading, deeper memory hierarchies, and diverse architectural capabilities. While emerging cloud-based HPC systems are able to deliver near-native performance, they bring more diver
Authors:
Dhabaleswar K. Panda, Mohammadreza Bayatpour, Hari Subramoni, S. Mahdieh Ghazimirsaeed, Shulei Xu
Published in:
CCGRID
Message Passing Interface (MPI) standard uses (source rank, tag, and communicator id) to properly place the incoming data into the application receive buffer. The act of searching through the receive queues and finding the appropriate match is called
Authors:
Hari Subramoni, Dhabaleswar K. Panda, Jahanzeb Maqbool Hashmi, Bharath Ramesh, Sourav Chakraborty, Kaushik Kandadi Suresh, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour
Published in:
Lecture Notes in Computer Science ISBN: 9783030507428
Overlap of computation and communication is critical for good application-level performance. Modern high-performance networks offer Hardware-assisted tag matching and rendezvous offload to enable communication progress without involving the host CPU.
External links:
https://explore.openaire.eu/search/publication?articleId=doi_________::885fbdf654ef9f529b56431b1e9791a1
https://doi.org/10.1007/978-3-030-50743-5_26
Published in:
Lecture Notes in Computer Science ISBN: 9783030507428
ISC
To reduce the training time of large-scale Deep Neural Networks (DNNs), Deep Learning (DL) scientists have started to explore parallelization strategies like data-parallelism, model-parallelism, and hybrid-parallelism. While data-parallelism has been
External links:
https://explore.openaire.eu/search/publication?articleId=doi_________::25bf943942cf41dcfa163b10e78babde
https://doi.org/10.1007/978-3-030-50743-5_5
Published in:
CLUSTER
The recent surge of Deep Learning (DL) models and applications can be attributed to the rise in computational resources, availability of large-scale datasets, and accessible DL frameworks such as TensorFlow and PyTorch. Because these frameworks have
Published in:
Journal of Computational Science. 52:101208
High-Performance Computing (HPC) research, from hardware and software to the end applications, provides remarkable computing power to help scientists solve complex problems in science, engineering, or even daily business. Over the last decades, Messa
Authors:
Hari Subramoni, Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda, Sourav Chakraborty
Published in:
IPDPS
Derived datatypes are commonly used in MPI applications to exchange non-contiguous data among processes. However, state-of-the-art MPI libraries do not offer efficient processing of derived datatypes and often rely on packing and unpacking the data a