Communication protocol optimization for enhanced GPU performance

Authors:	Sameh S. Sharkawi, George A. Chochia
Year of publication:	2020
Subject:	010302 applied physics; Hardware_MEMORYSTRUCTURES; General Computer Science; CPU cache; Computer science; InfiniBand; 02 engineering and technology; Parallel computing; computer.software_genre; 01 natural sciences; Active message; Inter-process communication; CUDA; 020204 information systems; Server; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); Host (network); computer
Source:	IBM Journal of Research and Development. 64:9:1-9:9
ISSN:	0018-8646
Description:	The U.S. Department of Energy CORAL program systems SUMMIT and SIERRA are based on hybrid servers comprising IBM POWER9 CPUs and NVIDIA V100 graphics processing units (GPUs), connected by two extended data rate (EDR) links to a high-speed InfiniBand network. A major challenge for the communication software stack is to optimize performance for all combinations of data origin and destination: host or GPU memory, same or different server. Alternative paths exist for routing data from GPU memory. When origin and destination are on different servers, data can be sent either via host memory or, bypassing host memory, with the GPUDirect feature. When origin and destination are on the same server, host memory can be bypassed with peer-to-peer interprocess communication (IPC). For large messages, pipelining makes the host memory data path competitive with GPUDirect. In this article, we explain the techniques used in the Spectrum MPI Parallel Active Message Interface layer to cache memory types and attributes in order to reduce the overhead associated with calling the CUDA application programming interface (API); in addition, we detail the protocols used for the different memory types: device memory, managed memory, and host memory. To illustrate, the caching technique achieved a device-to-device latency improvement of 26% for intranode transfers and 19% for internode transfers.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::65e5a2bebe0712ed2b84686cf12f2bcf https://doi.org/10.1147/jrd.2020.2967311
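
Note:	The description mentions caching memory types and attributes to avoid repeated CUDA API calls. The record does not say how that cache is organized, so the following is only a minimal sketch of the general idea, not the Spectrum MPI implementation. It assumes a cache keyed by address range with explicit invalidation when a buffer is freed. The names `MemoryTypeCache`, `MemoryType`, and `fake_query` are hypothetical; `fake_query` stands in for a costly driver lookup such as a wrapper around `cudaPointerGetAttributes`.

```python
from bisect import bisect_right
from enum import Enum


class MemoryType(Enum):
    HOST = "host"
    DEVICE = "device"
    MANAGED = "managed"


class MemoryTypeCache:
    """Remembers the memory type of address ranges already classified.

    `query` is a hypothetical callable standing in for the expensive
    driver lookup; it is called only on a cache miss.
    """

    def __init__(self, query):
        self._query = query
        self._starts = []   # sorted start addresses of cached ranges
        self._ranges = {}   # start address -> (end address, MemoryType)
        self.misses = 0     # number of times the costly query ran

    def lookup(self, addr, length):
        # Find the cached range with the greatest start <= addr.
        i = bisect_right(self._starts, addr) - 1
        if i >= 0:
            end, mem_type = self._ranges[self._starts[i]]
            if addr + length <= end:
                return mem_type  # hit: buffer lies inside a known range
        # Miss: pay for the query once, then remember the answer.
        self.misses += 1
        mem_type = self._query(addr)
        self._insert(addr, addr + length, mem_type)
        return mem_type

    def _insert(self, start, end, mem_type):
        # Drop cached ranges overlapping [start, end) so entries stay disjoint.
        keep = [s for s in self._starts
                if self._ranges[s][0] <= start or s >= end]
        self._ranges = {s: self._ranges[s] for s in keep}
        self._ranges[start] = (end, mem_type)
        self._starts = sorted(self._ranges)

    def invalidate(self, addr):
        # Assumed to be called when a buffer is freed, because the same
        # address may later be reused for a different memory type.
        i = bisect_right(self._starts, addr) - 1
        if i >= 0:
            start = self._starts[i]
            if addr < self._ranges[start][0]:
                del self._ranges[start]
                self._starts.pop(i)


def fake_query(addr):
    # Stand-in for the driver: pretend addresses >= 0x1000 are device memory.
    return MemoryType.DEVICE if addr >= 0x1000 else MemoryType.HOST


cache = MemoryTypeCache(fake_query)
cache.lookup(0x2000, 256)   # first use of the buffer: miss, query runs
cache.lookup(0x2040, 64)    # sub-buffer of the same range: served from cache
```

In this sketch, repeated sends from the same buffer trigger the costly query only once. The invalidation hook is what keeps a cached entry from going stale after the buffer is released.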