Search results

Report

Toward Performance-Portable PETSc for GPU-based Exascale Systems

Authors: Mills, Richard Tran, Adams, Mark F., Balay, Satish, Brown, Jed, Dener, Alp, Knepley, Matthew, Kruger, Scott E., Morgan, Hannah, Munson, Todd, Rupp, Karl, Smith, Barry F., Zampini, Stefano, Zhang, Hong, Zhang, Junchao

The Portable Extensible Toolkit for Scientific computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability address

External link: http://arxiv.org/abs/2011.00715

Academic article

Toward performance-portable PETSc for GPU-based exascale systems

Published in: Parallel Computing, vol. 108, December 2021

Academic article

Preparing sparse solvers for exascale computing

Authors: Anzt, Hartwig, Boman, Erik, Falgout, Rob, Ghysels, Pieter, Heroux, Michael, Li, Xiaoye, McInnes, Lois Curfman, Mills, Richard Tran, Rajamanickam, Sivasankaran, Rupp, Karl, Smith, Barry, Yamazaki, Ichitaro, Yang, Ulrike Meier

Published in: Philosophical Transactions: Mathematical, Physical and Engineering Sciences, 2020 Mar 01. 378(2166), 1-17.

External link: https://www.jstor.org/stable/26917447

Report

Finite Element Integration with Quadrature on the GPU

Authors: Knepley, Matthew G., Rupp, Karl, Terrel, Andy R.

We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call "thread transposition" to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal

External link: http://arxiv.org/abs/1607.04245

Report

Extreme-scale Multigrid Components within PETSc

Authors: May, Dave A., Sanan, Patrick, Rupp, Karl, Knepley, Matthew G., Smith, Barry F.

Elliptic partial differential equations (PDEs) frequently arise in continuum descriptions of physical processes relevant to science and engineering. Multilevel preconditioners represent a family of scalable techniques for solving discrete PDEs of thi

External link: http://arxiv.org/abs/1604.07163

Report

Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units

Authors: Rupp, Karl, Weinbub, Josef, Jüngel, Ansgar, Grasser, Tibor

Published in: ACM Transactions on Mathematical Software (TOMS), Volume 43, Issue 2, Article No. 11 (2016)

We revisit the implementation of iterative solvers on discrete graphics processing units and demonstrate the benefit of implementations using extensive kernel fusion for pipelined formulations over conventional implementations of classical formulatio

External link: http://arxiv.org/abs/1410.4054

Report

Performance Portability Study of Linear Algebra Kernels in OpenCL

Authors: Rupp, Karl, Tillet, Philippe, Rudolf, Florian, Weinbub, Josef, Grasser, Tibor, Jüngel, Ansgar

Published in: Proceedings of the International Workshop on OpenCL 2013 & 2014 (IWOCL)

The performance portability of OpenCL kernel implementations for common memory bandwidth limited linear algebra operations across different hardware generations of the same vendor as well as across vendors is studied. Certain combinations of kernel i

External link: http://arxiv.org/abs/1409.0669

Report

Achieving High Performance with Unified Residual Evaluation

Authors: Knepley, Matthew G., Brown, Jed, Rupp, Karl, Smith, Barry F.

We examine residual evaluation, perhaps the most basic operation in numerical simulation. By raising the level of abstraction in this operation, we can eliminate specialized code, enable optimization, and greatly increase the extensibility of existin

External link: http://arxiv.org/abs/1309.1204

Report

Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries

Authors: Demidov, Denis, Ahnert, Karsten, Rupp, Karl, Gottschling, Peter

We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is base

External link: http://arxiv.org/abs/1212.6326
