Showing 1 - 4 of 4 for search: '"Thibaut Lutz"'
Published in:
GPGPU@PPoPP
Remmelg, T, Lutz, T, Steuwer, M & Dubach, C 2016, Performance Portable GPU Code Generation for Matrix Multiplication. in GPGPU-9 General-Purpose GPU Workshop. pp. 22-31, General-Purpose GPU Workshop, Barcelona, Spain, 12/03/16. https://doi.org/10.1145/2884045.2884046
Parallel accelerators such as GPUs are notoriously hard to program; exploiting their full performance potential is a job best left for ninja programmers. High-level programming languages coupled with optimizing compilers have been proposed to attempt
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a4edc57e6d6c755cbc7947f001ba5cfc
https://eprints.gla.ac.uk/146600/7/146600.pdf
Published in:
GPGPU@PPoPP
Lutz, T, Fensch, C & Cole, M 2015, Helium: a transparent inter-kernel optimizer for OpenCL. in GPGPU 2015 Proceedings of the 8th Workshop on General Purpose Processing using GPUs. pp. 70-80. https://doi.org/10.1145/2716282.2716284
State of the art automatic optimization of OpenCL applications focuses on improving the performance of individual compute kernels. Programmers address opportunities for inter-kernel optimization in specific applications by ad-hoc hand tuning: manuall
Author:
Vinod Grover, Thibaut Lutz
Published in:
FHPC@ICFP
C++11 introduced a set of new features to extend the core language and the standard library. Amongst the new features are basic blocks for concurrency management like threads and atomic operation support, and a new syntax to declare single purpose, o
Published in:
Lutz, T, Fensch, C & Cole, M 2013, 'PARTANS: An autotuning framework for stencil computation on multi-GPU systems', ACM Transactions on Architecture and Code Optimization, vol. 9, no. 4, 59. https://doi.org/10.1145/2400682.2400718
ACM Transactions on Architecture and Code Optimization
GPGPUs are a powerful and energy-efficient solution for many problems. For higher performance or larger problems, it is necessary to distribute the problem across multiple GPUs, increasing the already high programming complexity. In this article, we
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bae4f652e706b5ce2a2f231fa608ac01
https://hdl.handle.net/20.500.11820/82cb78c9-afcd-4e38-9379-b81e3fb92174