FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks
Authors: | Md. Khaledur Rahman, Ariful Azad, Majedul Haque Sujon |
---|---|
Publication year: | 2021 |
Subject: |
Social and Information Networks (cs.SI)
FOS: Computer and information sciences; Computer Science - Machine Learning; Source code; Computer science; Graph embedding; media_common.quotation_subject; Computer Science - Social and Information Networks; Memory bandwidth; Parallel computing; Load balancing (computing); Matrix multiplication; Machine Learning (cs.LG); Memory management; Computer Science - Distributed, Parallel, and Cluster Computing; Kernel (image processing); Graph (abstract data type); Distributed, Parallel, and Cluster Computing (cs.DC); media_common |
Source: | IPDPS |
Description: | We develop a fused matrix multiplication kernel that unifies sampled dense-dense matrix multiplication (SDDMM) and sparse-dense matrix multiplication (SpMM) under a single operation called FusedMM. By using user-defined functions, FusedMM can capture almost all computational patterns needed by popular graph embedding and GNN approaches. FusedMM is an order of magnitude faster than its equivalent kernels in Deep Graph Library. The superior performance of FusedMM comes from low-level vectorized kernels, a suitable load-balancing scheme, and efficient use of memory bandwidth. FusedMM can tune its performance using a code generator and performs equally well on Intel, AMD, and ARM processors. FusedMM speeds up an end-to-end graph embedding algorithm by up to 28x on different processors. Comment: 11 pages, published in IEEE IPDPS 2021 |
Database: | OpenAIRE |
External link: |
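
The fusion summarized in the description can be illustrated with a minimal pure-Python sketch. It is not the FusedMM library's API: the names `fused_mm`, `unfused_mm`, and `edge_fn`, the CSR layout, and the toy graph are all assumptions made for illustration. For each edge (i, j) of a sparse adjacency matrix, the SDDMM step computes a scalar from the dot product of row i of X and row j of Y, a user-defined function transforms it, and the SpMM step immediately accumulates the scaled row of Y into the output, so the intermediate sparse matrix is never stored.

```python
import math

def dot(u, v):
    """Dense dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fused_mm(indptr, indices, X, Y, edge_fn):
    """Hypothetical fused SDDMM + SpMM over a CSR adjacency structure.

    The per-edge scalar is consumed as soon as it is produced, so no
    intermediate sparse matrix is materialized.
    """
    n, d = len(indptr) - 1, len(Y[0])
    Z = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            s = edge_fn(dot(X[i], Y[j]))   # SDDMM step + user-defined function
            for c in range(d):             # SpMM step, fused in the same loop
                Z[i][c] += s * Y[j][c]
    return Z

def unfused_mm(indptr, indices, X, Y, edge_fn):
    """Reference version: SDDMM first (stores one value per edge), then SpMM."""
    n, d = len(indptr) - 1, len(Y[0])
    vals = [edge_fn(dot(X[i], Y[indices[k]]))
            for i in range(n) for k in range(indptr[i], indptr[i + 1])]
    Z = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            for c in range(d):
                Z[i][c] += vals[k] * Y[indices[k]][c]
    return Z

# Toy graph (assumed for illustration): edges 0->1, 0->2, 1->0; node 2 has none.
indptr = [0, 2, 3, 3]
indices = [1, 2, 0]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = X  # same embeddings on both sides, as in many graph embedding setups

Z_fused = fused_mm(indptr, indices, X, Y, sigmoid)
Z_unfused = unfused_mm(indptr, indices, X, Y, sigmoid)
```

Both functions produce the same output; the fused version simply avoids storing the per-edge values. Swapping `sigmoid` for another callable is a rough stand-in for the user-defined functions the description mentions.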