PIM-Tree

Autor:	Kang, Hongbo, Zhao, Yiwei, Blelloch, Guy E., Dhulipala, Laxman, Gu, Yan, McGuffey, Charles, Gibbons, Phillip B.
Rok vydání:	2022
Předmět:	Performance (cs.PF) FOS: Computer and information sciences Computer Science - Performance Computer Science - Databases Computer Science - Distributed Parallel and Cluster Computing H.2.4 Computer Science - Data Structures and Algorithms General Engineering Databases (cs.DB) Data Structures and Algorithms (cs.DS) Distributed Parallel and Cluster Computing (cs.DC) 68P05
Zdroj:	Proceedings of the VLDB Endowment. 16:946-958
ISSN:	2150-8097
DOI:	10.14778/3574245.3574275
Popis:	The performance of today's in-memory indexes is bottlenecked by the memory latency/bandwidth wall. Processing-in-memory (PIM) is an emerging approach that potentially mitigates this bottleneck, by enabling low-latency memory access whose aggregate memory bandwidth scales with the number of PIM nodes. There is an inherent tension, however, between minimizing inter-node communication and achieving load balance in PIM systems, in the presence of workload skew. This paper presents PIM-tree , an ordered index for PIM systems that achieves both low communication and high load balance, regardless of the degree of skew in data and queries. Our skew-resistant index is based on a novel division of labor between the host CPU and PIM nodes, which leverages the strengths of each. We introduce push-pull search , which dynamically decides whether to push queries to a PIM-tree node or pull the node's keys back to the CPU based on workload skew. Combined with other PIM-friendly optimizations ( shadow subtrees and chunked skip lists ), our PIM-tree provides high-throughput, (guaranteed) low communication, and (guaranteed) high load balance, for batches of point queries, updates, and range scans. We implement PIM-tree, in addition to prior proposed PIM indexes, on the latest PIM system from UPMEM, with 32 CPU cores and 2048 PIM nodes. On workloads with 500 million keys and batches of 1 million queries, the throughput using PIM-trees is up to 69.7X and 59.1x higher than the two best prior PIM-based methods. As far as we know these are the first implementations of an ordered index on a real PIM system.
Databáze:	OpenAIRE
Externí odkaz:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::24f92d6ca87072e76c8f5eaea3fb98b6 https://doi.org/10.14778/3574245.3574275 Zobrazit plný text záznamu