Showing 1 - 10 of 51 for search: '"María Jesús Garzarán"'
Authors:
Sayantan Sur, Erik Paulson, Hajime Fujita, María Jesús Garzarán, Charles J. Archer, Chongxiao Cao
Published in:
Parallel Computing. 87:1-10
The Message Passing Interface (MPI) standard supports Remote Memory Access (RMA) operations, where a process can read or write memory of another process without requiring the target process to be involved in the communication. This enables new more e
Published in:
Journal of Parallel and Distributed Computing. 120:282-294
Several approaches implement efficient BFS algorithms for multicores and for GPUs. However, when targeting heterogeneous architectures, it is still an open problem how to distribute the work among the CPU cores and the accelerators. In this paper, we
Published in:
ISCA
The same flexibility that makes dynamic scripting languages appealing to programmers is also the primary cause of their low performance. To access objects of potentially different types, the compiler creates a dispatcher with a series of if statement
Published in:
EuroMPI
Triggered operations and counting events or counters are building blocks used by communication libraries, such as MPI, to offload collective operations to the Host Fabric Interface (HFI) or Network Interface Card (NIC). Triggered operations can be us
Authors:
Shintaro Iwasaki, Chongxiao Cao, Charles J. Archer, Hajime Fujita, Yanfei Guo, Pavan Balaji, Min Si, Kenjiro Taura, Jeff R. Hammond, Kenneth Raffenetti, Sagar Thapaliya, María Jesús Garzarán, Mikhail Shiryaev, Michael Chuvelev, Abdelhalim Amer, Michael Alan Blocksome
Published in:
ICS
Efforts to mitigate lock contention from concurrent threaded accesses to MPI have reduced contention through fine-grained locking, avoided locking altogether by offloading communication to dedicated threads, or alleviated negative side effects from c
Published in:
HPCA
Authors:
Marc Gamell Balmana, Rashid Kaleem, Alexander Sannikov, María Jesús Garzarán, Dmitry Durnov, Akhil Langer, Surabhi Jain
Published in:
SC
Collective operations are used in MPI programs to express common communication patterns, collective computations, or synchronization. In many collectives, such as MPI_Allreduce, the intra-node component of the collective lies on the critical path, as
Authors:
Angeles Navarro, Antonio Vilches, Ruben Gran, Rafael Asenjo, María Jesús Garzarán, Francisco Corbera
Published in:
IEEE Transactions on Parallel and Distributed Systems. 27:1099-1115
In this paper, we consider the problem of efficiently executing streaming applications on commodity processors composed of several cores and an on-chip GPU. Streaming applications, such as those in vision and video analytic, consist of a pipeline of
Published in:
IPDPS Workshops
Many-core architectures such as the Intel® Xeon Phi™ provide dozens of cores and hundreds of hardware threads. For these machines, a basic MPI implementation is inefficient, as it does not take advantage of the shared data across the ranks on the s
Authors:
Rafael Asenjo, María Jesús Garzarán, Francisco Corbera, Antonio Vilches, Angeles Navarro, Ruben Gran
Published in:
ICCS
Commodity processors are comprised of several CPU cores and one integrated GPU. To fully exploit this type of architecture, one needs to automatically determine how to partition the workload between both devices. This is especially challenging for ir