Improving the performance of classical linear algebra iterative methods via hybrid parallelism

Authors:	Pedro J. Martinez-Ferrer, Tufan Arslan, Vicenç Beltran
Contributors:	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. PM - Programming Models
Year of publication:	2023
Subject:	FOS: Computer and information sciences; Algebras, Linear; Computer Networks and Communications; G.1.3; Distributed-memory; Theoretical Computer Science; Shared-memory; Artificial Intelligence; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Linear algebra; Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]; 15-04; Computer Science - Performance; Parallel processing (Electronic computers); Processament en paral·lel (Ordinadors); I.6.3; Hybrid parallelism; Informàtica::Informàtica teòrica::Algorísmica i teoria de la complexitat [Àrees temàtiques de la UPC]; Performance (cs.PF); Computer Science - Distributed Parallel and Cluster Computing; Hardware and Architecture; MPI; Distributed Parallel and Cluster Computing (cs.DC); Àlgebra lineal; Software
DOI:	10.48550/arxiv.2305.05988
Description:	We propose fork-join and task-based hybrid implementations of four classical linear algebra iterative methods (Jacobi, Gauss-Seidel, conjugate gradient and biconjugate gradient stabilised) as well as variations of them. The algorithms are documented and the corresponding source code is made publicly available for reproducibility. Both weak and strong scalability benchmarks are conducted to statistically analyse their relative efficiencies. The weak scalability results show the superiority of a task-based hybrid parallelisation over MPI-only and fork-join hybrid implementations. Indeed, the task-based model achieves speedups up to 25% larger than those of its MPI-only counterpart, depending on the numerical method and the computational resources used. For strong scalability scenarios, hybrid methods based on tasks remain more efficient with moderate computational resources, where data locality does not play an important role. Fork-join hybridisation often yields mixed results and hence does not present a competitive advantage over a much simpler MPI approach. Comment: 33 pages, 6 figures, accepted manuscript in Journal of Parallel and Distributed Computing
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::eb9205c05f1c7c67ca0766908f115d3d