Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems.

Authors: Dózsa, Gábor; Kumar, Sameer; Balaji, Pavan; Buntinas, Darius; Goodell, David; Gropp, William; Ratterman, Joe; Thakur, Rajeev
Source: Recent Advances in the Message Passing Interface; 2010, p11-20, 10p
Abstract: With the ever-increasing numbers of cores per node on HPC systems, applications are increasingly using threads to exploit the shared memory within a node, combined with MPI across nodes. Achieving high performance when a large number of concurrent threads make MPI calls is a challenging task for an MPI implementation. We describe the design and implementation of our solution in MPICH2 to achieve high-performance multithreaded communication on the IBM Blue Gene/P. We use a combination of a multichannel-enabled network interface, fine-grained locks, lock-free atomic operations, and specially designed queues to provide a high degree of concurrent access while still maintaining MPI's message-ordering semantics. We present performance results that demonstrate that our new design improves the multithreaded message rate by a factor of 3.6 compared with the existing implementation on the BG/P. Our solutions are also applicable to other high-end systems that have parallel network access capabilities. [ABSTRACT FROM AUTHOR]
Database: Complementary Index