Showing 1 - 10 of 51 for search: '"Christian Fensch"'
Published in:
Metzger, P, Seeker, V, Fensch, C & Cole, M 2021, 'Device-Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems', ACM Transactions on Architecture and Code Optimization, vol. 18, no. 4, 57. https://doi.org/10.1145/3471909
Existing OS techniques for homogeneous many-core systems make it simple for single and multithreaded applications to migrate between cores. Heterogeneous systems do not benefit so fully from this flexibility, and applications that cannot migrate in m
Published in:
RTAS
Metzger, P, Cole, M, Fensch, C, Aldinucci, M & Bini, E 2020, Enforcing Deadlines for Skeleton-based Parallel Programming. In 2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS). Institute of Electrical and Electronics Engineers (IEEE), pp. 188-199, 26th IEEE Real-Time and Embedded Technology and Applications Symposium, Sydney, New South Wales, Australia, 21/04/20. https://doi.org/10.1109/RTAS48715.2020.000-7
High throughput applications with real-time guarantees are increasingly relevant. For these applications, parallelism must be exposed to meet deadlines. Directed Acyclic Graphs (DAGs) are a popular and very general application model that can capture
Published in:
Lecture Notes in Computer Science ISBN: 9783030105488
Euro-Par Workshops
Resilience for HPC applications typically is implemented as a CPU-based rollback-recovery technique. In this context, long running accelerator computations on GPUs pose a major challenge as these devices usually do not offer any means of interrupt. T
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::f708618ecba67d20ce5cb1f9d835d9b5
https://doi.org/10.1007/978-3-030-10549-5_64
Published in:
Euro-Par 2018: Parallel Processing ISBN: 9783319969824
Euro-Par
To address NUMA performance anomalies, programmers often resort to application specific optimizations that are not transferable to other programs, or to generic optimizations that do not perform well in all cases. Skeleton based programming models al
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::f261a14b498d91e5105ad9028d140913
https://doi.org/10.1007/978-3-319-96983-1_42
Published in:
IEEE Transactions on Computers
Many-core architectures provide an efficient way of harnessing the growing numbers of transistors available. However, energy and latency costs of communication increasingly limit the parallel programs running on these platforms. Existing designs prov
Published in:
Steuwer, M, Fensch, C, Lindley, S & Dubach, C 2015, Generating Performance Portable Code using Rewrite Rules: From High-Level Functional Expressions to High-Performance OpenCL Code. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming. ACM SIGPLAN Notices, vol. 50, no. 9, Vancouver, BC, Canada, pp. 205-217, 20th ACM SIGPLAN International Conference on Functional Programming, Vancouver, British Columbia, Canada, 31/08/15. https://doi.org/10.1145/2784731.2784754
ICFP
Computers have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort resulting in a tensio
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::349389893fbd65cfb030e7b36bc95d80
https://eprints.gla.ac.uk/146605/7/146605.pdf
Published in:
ROSS@HPDC
Collins, A, Harris, T, Cole, M & Fensch, C 2015, LIRA: Adaptive Contention-Aware Thread Placement for Parallel Runtime Systems. In ROSS '15 Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers, 2. https://doi.org/10.1145/2768405.2768407
Running multiple parallel programs on multi-socket multi-core machines using commodity hardware is increasingly common for data analytics and cluster workloads. These workloads exhibit bursty behavior and are rarely tuned to specific hardware. This l
Published in:
GPGPU@PPoPP
Lutz, T, Fensch, C & Cole, M 2015, Helium: a transparent inter-kernel optimizer for OpenCL. In GPGPU 2015 Proceedings of the 8th Workshop on General Purpose Processing using GPUs, pp. 70-80. https://doi.org/10.1145/2716282.2716284
State of the art automatic optimization of OpenCL applications focuses on improving the performance of individual compute kernels. Programmers address opportunities for inter-kernel optimization in specific applications by ad-hoc hand tuning: manuall
Author:
Christian Fensch, Michael O'Boyle
Published in:
PACT
We welcome you to the 22nd International Conference on Parallel Architectures and Compilation Techniques - PACT'13 at the Surgeons' Hall, Edinburgh, Scotland, UK.
Published in:
Lutz, T, Fensch, C & Cole, M 2013, 'PARTANS: An autotuning framework for stencil computation on multi-GPU systems', ACM Transactions on Architecture and Code Optimization, vol. 9, no. 4, 59. https://doi.org/10.1145/2400682.2400718
ACM Transactions on Architecture and Code Optimization
GPGPUs are a powerful and energy-efficient solution for many problems. For higher performance or larger problems, it is necessary to distribute the problem across multiple GPUs, increasing the already high programming complexity. In this article, we
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bae4f652e706b5ce2a2f231fa608ac01
https://hdl.handle.net/20.500.11820/82cb78c9-afcd-4e38-9379-b81e3fb92174