Design and analysis of scheduling strategies for multi-CPU and multi-GPU architectures
Authors: | João V. F. Lima, Thierry Gautier, Nicolas Maillard, Vincent Danjean, Bruno Raffin |
---|---|
Contributors: | PrograMming and scheduling design fOr Applications in Interactive Simulation (MOAIS), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire d'Informatique de Grenoble (LIG), Université Pierre Mendès France - Grenoble 2 (UPMF), Université Joseph Fourier - Grenoble 1 (UJF), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Institut National Polytechnique de Grenoble (INPG), Centre National de la Recherche Scientifique (CNRS), Instituto de Informática da UFRGS (UFRGS), Universidade Federal do Rio Grande do Sul [Porto Alegre] (UFRGS) |
Year of publication: | 2015 |
Subject: | Computer Networks and Communications; Computer science; Distributed computing; Computation; Parallel programming; Task parallelism; 0102 computer and information sciences; 02 engineering and technology; Parallel computing; 01 natural sciences; Theoretical Computer Science; Scheduling (computing); Data-flow dependencies; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; Work stealing; Multi gpu; 020203 distributed computing; Locality; Computer Graphics and Computer-Aided Design; 010201 computation theory & mathematics; Hardware and Architecture; Linear algebra; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Accelerators; Software; Cholesky decomposition |
Source: | Parallel Computing, Elsevier, 2015, 44, pp.37-52. ⟨10.1016/j.parco.2015.03.001⟩ |
ISSN: | 0167-8191 |
DOI: | 10.1016/j.parco.2015.03.001 |
Description: | We evaluated four scheduling strategies for multi-CPU and multi-GPU architectures. We designed a framework with performance models for task and transfer prediction. Work stealing is efficient with task annotations and data locality heuristics. HEFT cost model performs better on very regular computations. In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of the XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture with 12 CPUs and 8 GPUs, we analysed our scheduling strategies with four benchmarks: a BLAS-1 AXPY vector operation, a Jacobi 2D iterative computation, and two linear algebra algorithms, Cholesky and LU. We conclude that the use of work stealing may be efficient if task annotations are given along with a data locality strategy. Furthermore, our experimental results suggest that HEFT scheduling performs better on applications with very regular computations and low data locality. |
Database: | OpenAIRE |