Showing 1 - 10 of 14 for search: '"Abdelhalim Amer"'
Published in:
IEEE Transactions on Parallel and Distributed Systems. 31:1859-1877
User-level threads have been widely adopted as a means of achieving lightweight concurrent execution without the costs of OS-level threads. Nevertheless, the costs of managing user-level threads represent a performance barrier that dictates how fine
Published in:
ACM Transactions on Parallel Computing. 7:1-32
The popularity of Non-Uniform Memory Access (NUMA) architectures has led to numerous locality-preserving hierarchical lock designs, such as HCLH, HMCS, and cohort locks. Locality-preserving locks trade fairness for higher throughput. Hence, some inst
Author:
Yanjie Wei, Jeff R. Hammond, Milind Chabbi, Huiwei Lu, Satoshi Matsuoka, Pavan Balaji, Abdelhalim Amer
Published in:
ACM Transactions on Parallel Computing. 5:1-21
In this article, we investigate contention management in lock-based thread-safe MPI libraries. Specifically, we make two assumptions: (1) locks are the only form of synchronization when protecting communication paths; and (2) contention occurs, and t
Published in:
PACT
OpenMP is widely used by a number of applications, computational libraries, and runtime systems. As a result, multiple levels of the software stack use OpenMP independently of one another, often leading to nested parallel regions. Although exploiting
Author:
Shintaro Iwasaki, Chongxiao Cao, Charles J. Archer, Hajime Fujita, Yanfei Guo, Pavan Balaji, Min Si, Kenjiro Taura, Jeff R. Hammond, Kenneth Raffenetti, Sagar Thapaliya, María Jesús Garzarán, Mikhail Shiryaev, Michael Chuvelev, Abdelhalim Amer, Michael Alan Blocksome
Published in:
ICS
Efforts to mitigate lock contention from concurrent threaded accesses to MPI have reduced contention through fine-grained locking, avoided locking altogether by offloading communication to dedicated threads, or alleviated negative side effects from c
Published in:
SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
Author:
Pavan Balaji, Sriram Krishnamoorthy, George Bosilca, Esteban Meneses, Huiwei Lu, Philip Carns, Prateek Jindal, Thomas Herault, Cyril Bordage, Jonathan Lifflander, Kenjiro Taura, Adrián Castelló, Yanhua Sun, Shintaro Iwasaki, Abdelhalim Amer, Pete Beckman, Sangmin Seo, Damien Genet, Marc Snir, Alex Brooks, Laxmikant V. Kale
Published in:
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems, 2018, 29 (3), pp.512-526. ⟨10.1109/TPDS.2017.2766062⟩
Repositori Universitat Jaume I
Universitat Jaume I
IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2018, 29 (3), pp.512-526. ⟨10.1109/TPDS.2017.2766062⟩
In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-o
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bfe58a89003fa42a3cf17b1bdeec0255
https://inria.hal.science/hal-01887586
Author:
Alexander Sannikov, Sangmin Seo, Yanfei Guo, Ken Raffenetti, Paul Fischer, Tomislav Janjusic, Thilina Rathnayake, Michael Alan Blocksome, Jithin Jose, Matthew Otten, Hajime Fujita, Sergey Oblomov, Sayantan Sur, Masamichi Takagi, Pavan Balaji, Masayuki Hatanaka, Misun Min, Abdelhalim Amer, Paul Coffman, Wesley Bland, Akhil Langer, Michael Chuvelev, Dmitry Durnov, Charles J. Archer, Min Si, Lena Oden, Gengbin Zheng, Xin Zhao
Published in:
SC
This paper provides an in-depth analysis of the software overheads in the MPI performance-critical path and exposes mandatory performance overheads that are unavoidable based on the MPI-3.1 specification. We first present a highly optimized implement
Published in:
CCGrid
Concurrent multithreaded access to the Message Passing Interface (MPI) is gaining importance to support emerging hybrid MPI applications. The interoperability between threads and MPI, however, is complex and renders efficient implementations nontrivi
Published in:
PPOPP
The popularity of Non-Uniform Memory Access (NUMA) architectures has led to numerous locality-preserving hierarchical lock designs, such as HCLH, HMCS, and cohort locks. Locality-preserving locks trade fairness for higher throughput. Hence, some inst