Showing 1 - 10 of 117 for search: '"Iteration space"'
Author:
Bielecki Włodzimierz, Pałkowski Marek
Published in:
International Journal of Applied Mathematics and Computer Science, Vol 26, Iss 4, Pp 919-939 (2016)
A novel approach to generation of tiled code for arbitrarily nested loops is presented. It is derived via a combination of the polyhedral and iteration space slicing frameworks. Instead of program transformations represented by a set of affine functi
External link:
https://doaj.org/article/16827314348a4537a8be7ed06965f546
Published in:
DLS
Execution times may be reduced by offloading parallel loop nests to a GPU. Auto-parallelizing compilers are common for static languages, often using a cost model to determine when the GPU execution speed will outweigh the offload overheads. Nowadays
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f1d9958ea3bddf148e4c1f9a3297efc0
https://eprints.gla.ac.uk/226320/1/226320.pdf
Author:
Wlodzimierz Bielecki, Marek Palkowski
Published in:
International Journal of Applied Mathematics and Computer Science, Vol 26, Iss 4, Pp 919-939 (2016)
A novel approach to generation of tiled code for arbitrarily nested loops is presented. It is derived via a combination of the polyhedral and iteration space slicing frameworks. Instead of program transformations represented by a set of affine functi
Published in:
ARRAY@PLDI
Rank polymorphism serves as a type of control flow used in array-oriented languages, where functions are automatically lifted to operate on high-dimensional arguments. The iteration space is derived directly from the shape of the data, presenting a c
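The idea that the iteration space follows from the shape of the data can be illustrated with a minimal sketch. This is a generic illustration, not the method of the paper listed above, whose abstract is truncated here. The helper names `shape` and `lift` are hypothetical, and regular nested Python lists stand in for arrays.

```python
def shape(x):
    """Return the shape of a regular nested list; a scalar has shape ()."""
    if not isinstance(x, list):
        return ()
    return (len(x),) + (shape(x[0]) if x else ())


def lift(f, x):
    """Apply the scalar function f at every index of x.

    No explicit loop bounds are given: the iteration space is
    whatever the shape of x dictates (assumes x is regular).
    """
    if not isinstance(x, list):
        return f(x)
    return [lift(f, item) for item in x]


matrix = [[1, 2, 3], [4, 5, 6]]
doubled = lift(lambda v: v * 2, matrix)
```

The same `lift` call works unchanged on a scalar, a vector, or a matrix, which is the sense in which the function is lifted over arguments of any rank.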
Author:
M. Wolfe
Published in:
SC
Subdividing the iteration space of a loop into blocks or tiles with a fixed maximum size has several advantages. Tiles become a natural candidate as the unit of work for parallel task scheduling. Synchronization between processors can be done between
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::75e4a44b49cd4ecb92b7094ae62209c7
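Subdividing an iteration space into tiles of a fixed maximum size, as the abstract above describes, can be sketched as follows. This is a minimal illustration that assumes a rectangular two-dimensional space and square tiles. The function name `tiled_iteration_space` and its parameters are hypothetical and are not taken from the paper.

```python
def tiled_iteration_space(n, m, tile):
    """Yield (tile_origin, points) for an n x m rectangular iteration space.

    Each tile holds at most tile * tile points; tiles on the upper
    boundaries are clipped with min() so no point falls outside the space.
    """
    for ii in range(0, n, tile):        # tile origins along i
        for jj in range(0, m, tile):    # tile origins along j
            points = [
                (i, j)
                for i in range(ii, min(ii + tile, n))
                for j in range(jj, min(jj + tile, m))
            ]
            yield (ii, jj), points


# A 5 x 4 space cut into tiles of at most 2 x 2 points.
tiles = list(tiled_iteration_space(5, 4, 2))
```

Each yielded tile could serve as one unit of work for a task scheduler. Whether tiles may actually run in parallel depends on the loop's dependences, which this sketch does not model.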
Author:
Patricia Della Mea Plentz, Márcio Castro, Pedro Henrique Penna, Henrique Freitas, François Broquedis, Jean-François Méhaut
Published in:
Simpósio em Sistemas Computacionais de Alto Desempenho
Simpósio em Sistemas Computacionais de Alto Desempenho, Oct 2017, Campinas, Brazil
Workload-aware loop schedulers were introduced to deliver better performance than classical strategies, but they present limitations on workload estimation, chunk scheduling and integrability with applications. Targeting these cha
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::153eb3ba61b10992fc61b41c23c317db
https://hal.archives-ouvertes.fr/hal-01596427
Author:
Marek Palkowski, Bielecki, W.
Published in:
Computing and Informatics, Vol 35, No 6 (2016), Pp 1277-1306
Scopus-Elsevier
The paper presents a source-to-source compiler, TRACO, for automatic extraction of both coarse- and fine-grained parallelism available in C/C++ loops. Parallelization techniques implemented in TRACO are based on the transitive closure of a relation d
Published in:
ACM Transactions on Architecture and Code Optimization. 11:1-25
Memory management searches for the resources required to store the concurrently alive elements. The solution quality is affected by the representation of the element accesses: a sub-optimal representation leads to overestimation and a non-scalable re
Published in:
ACM Transactions on Design Automation of Electronic Systems. 19:1-30
Storage-size management techniques aim to reduce the resources required to store elements and to concurrently provide efficient addressing during element accessing. Existing techniques are less appropriate for large iteration spaces with increased nu