Design and analysis of scheduling strategies for multi-CPU and multi-GPU architectures

Authors:	João V. F. Lima, Thierry Gautier, Nicolas Maillard, Vincent Danjean, Bruno Raffin
Contributors:	PrograMming and scheduling design fOr Applications in Interactive Simulation (MOAIS); Inria Grenoble - Rhône-Alpes; Institut National de Recherche en Informatique et en Automatique (Inria); Laboratoire d'Informatique de Grenoble (LIG); Université Pierre Mendès France - Grenoble 2 (UPMF); Université Joseph Fourier - Grenoble 1 (UJF); Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Institut National Polytechnique de Grenoble (INPG); Centre National de la Recherche Scientifique (CNRS); Instituto de Informática da UFRGS (UFRGS); Universidade Federal do Rio Grande do Sul [Porto Alegre] (UFRGS)
Publication year:	2015
Subject:	Computer Networks and Communications; Computer science; Distributed computing; Computation; Parallel programming; Task parallelism; 0102 computer and information sciences; 02 engineering and technology; Parallel computing; 01 natural sciences; Theoretical Computer Science; Scheduling (computing); Data-flow dependencies; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; Work stealing; Multi gpu; 020203 distributed computing; Locality; Computer Graphics and Computer-Aided Design; 010201 computation theory & mathematics; Hardware and Architecture; Linear algebra; [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; Accelerators; Software; Cholesky decomposition
Source:	Parallel Computing, Elsevier, 2015, 44, pp.37-52. ⟨10.1016/j.parco.2015.03.001⟩
ISSN:	0167-8191
DOI:	10.1016/j.parco.2015.03.001
Description:	Highlights: We evaluated four scheduling strategies for multi-CPU and multi-GPU architectures. We designed a framework with performance models for task and transfer prediction. Work stealing is efficient with task annotations and data locality heuristics. The HEFT cost model performs better on very regular computations. Abstract: In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of the XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture with 12 CPUs and 8 GPUs, we analysed our scheduling strategies with four benchmarks: a BLAS-1 AXPY vector operation, a Jacobi 2D iterative computation, and two linear algebra algorithms, Cholesky and LU. We conclude that work stealing may be efficient if task annotations are given along with a data locality strategy. Furthermore, our experimental results suggest that HEFT scheduling performs better on applications with very regular computations and low data locality.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b7b9b66fe87dc55be1385ebb96e2a24 https://doi.org/10.1016/j.parco.2015.03.001