Improving the performance of classical linear algebra iterative methods via hybrid parallelism
Authors: | Pedro J. Martinez-Ferrer, Tufan Arslan, Vicenç Beltran |
---|---|
Contributors: | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. PM - Programming Models |
Publication year: | 2023 |
Subject: |
FOS: Computer and information sciences
Algebras, Linear; Computer Networks and Communications; G.1.3; Distributed-memory; Theoretical Computer Science; Shared-memory; Artificial Intelligence; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Linear algebra; Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]; 15-04; Computer Science - Performance; Parallel processing (Electronic computers); Processament en paral·lel (Ordinadors); I.6.3; Hybrid parallelism; Informàtica::Informàtica teòrica::Algorísmica i teoria de la complexitat [Àrees temàtiques de la UPC]; Performance (cs.PF); Computer Science - Distributed Parallel and Cluster Computing; Hardware and Architecture; MPI; Distributed Parallel and Cluster Computing (cs.DC); Àlgebra lineal; Software |
DOI: | 10.48550/arxiv.2305.05988 |
Description: | We propose fork-join and task-based hybrid implementations of four classical linear algebra iterative methods (Jacobi, Gauss-Seidel, conjugate gradient and biconjugate gradient stabilised) as well as variations of them. Algorithms are duly documented and the corresponding source code is made publicly available for reproducibility. Both weak and strong scalability benchmarks are conducted to statistically analyse their relative efficiencies. The weak scalability results assert the superiority of a task-based hybrid parallelisation over MPI-only and fork-join hybrid implementations. Indeed, the task-based model is able to achieve speedups of up to 25% larger than its MPI-only counterpart depending on the numerical method and the computational resources used. For strong scalability scenarios, hybrid methods based on tasks remain more efficient with moderate computational resources where data locality does not play an important role. Fork-join hybridisation often yields mixed results and hence does not present a competitive advantage over a much simpler MPI approach. Comment: 33 pages, 6 figures, accepted manuscript in Journal of Parallel and Distributed Computing |
Database: | OpenAIRE |
External link: |