Node-Type-Based Load-Balancing Routing for Parallel Generalized Fat-Trees

Author: Gliksberg, John; Quintin, Jean-Noel; Garcia, Pedro Javier
Year of publication: 2022
Subject:
Source: 2018 IEEE 4th International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB), Feb 2018, Vienna, Austria. pp. 9-15
Document type: Working Paper
DOI: 10.1109/HiPINEB.2018.00010
Description: High-Performance Computing (HPC) clusters are made up of a variety of node types (usually compute, I/O, service, and GPGPU nodes), and applications do not use nodes of different types in the same way. The resulting communication patterns reflect how nodes are organized into groups, and current routing algorithms that are optimal for all-to-all patterns will not always maximize performance for group-specific communications. Since application communication patterns are rarely available beforehand, we rely on node types as a good guess for node usage. We describe node-type heterogeneity and analyse the performance degradation caused by an unfavorable distribution of nodes of the same type. We provide an extension to routing algorithms for Parallel Generalized Fat-Tree topologies (PGFTs) that balances load among groups of nodes of the same type. We show how it removes these performance issues by comparing results in a variety of situations against the corresponding classical algorithms.
Database: arXiv