Search results - "Cedric Nugteren"

On the Anatomy of Predictive Models for Accelerating GPU Convolution Kernels and Beyond

Authors: Grigori Fursin, Anton Lokhmotov, Bruno Carpentieri, Fabiana Zollo, Marco Cianfriglia, Damiano Perri, Osvaldo Gervasi, Paolo Sylos Labini, Cedric Nugteren, Flavio Vella

Published in: ACM Transactions on Architecture and Code Optimization. 18:1-24

Efficient HPC libraries often expose multiple tunable parameters, algorithmic implementations, or a combination of them, to provide optimized routines. The optimal parameters and algorithmic choices may depend on input properties such as the shapes o

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b3323c748ad929ec608f2d21ffb6c3eb
https://doi.org/10.1145/3434402

EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection

Authors: Michael Hofmann, Nora Baka, Cedric Nugteren, Mohsen Ghafoorian, Olaf Booij

Published in: Lecture Notes in Computer Science ISBN: 9783030110086
ECCV Workshops (1)

Convolutional neural networks have been successfully applied to semantic segmentation problems. However, there are many problems that are inherently not pixel-wise classification problems but are nevertheless frequently formulated as semantic segment

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::f9cd66531516756edb8ebb740277b4e2
https://doi.org/10.1007/978-3-030-11009-3_15

CLTune: A Generic Auto-Tuner for OpenCL Kernels

Authors: Valeriu Codreanu, Cedric Nugteren

This work presents CLTune, an auto-tuner for OpenCL kernels. It evaluates and tunes kernel performance of a generic, user-defined search space of possible parameter-value combinations. Example parameters include the OpenCL workgroup size, vector data

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fe79b10603275e0cef9ea2c3c3e0ae71
http://arxiv.org/abs/1703.06503

CLBlast: A Tuned OpenCL BLAS Library

Author: Cedric Nugteren

Published in: IWOCL

This work introduces CLBlast, an open-source BLAS library providing optimized OpenCL routines to accelerate dense linear algebra for a wide variety of devices. It is targeted at machine learning and HPC applications and thus provides a fast matrix-mu

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6903e9c701c28a306722862f72b17595

Algorithmic species : a classification of affine loop nests for parallel programming

Authors: Cedric Nugteren, Henk Corporaal, Pieter Custers

Published in: ACM Transactions on Architecture and Code Optimization, 9(4):40, 1-25. Association for Computing Machinery, Inc

Code generation and programming have become ever more challenging over the last decade due to the shift towards parallel processing. Emerging processor architectures such as multi-cores and GPUs increasingly exploit parallelism, requiring programmers

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0579c2744b5fd7cb7bcda5b71e6136f3
https://doi.org/10.1145/2400682.2400699

(AS)2: Accelerator Synthesis using Algorithmic Skeletons for Rapid Design Space Exploration

Authors: Shakith Fernando, Mark Wijtvliet, Cedric Nugteren, Akash Kumar, Henk Corporaal

Published in: Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::8cb2ea0ffc7fde456576a0d5fae007c4
https://doi.org/10.7873/date.2015.0015

Bones : an automatic skeleton-based C-to-CUDA compiler for GPUs

Authors: Cedric Nugteren, Henk Corporaal

Published in: ACM Transactions on Architecture and Code Optimization, 11(4):35. Association for Computing Machinery, Inc

The shift toward parallel processor architectures has made programming and code generation increasingly challenging. To address this programmability challenge, this article presents a technique to fully automatically generate efficient and readable c

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::49430cf88ee6f7593a977ef265d66a48
https://research.tue.nl/nl/publications/814d3f4d-1ac2-408c-827a-6b1bf4d567e2

A detailed GPU cache model based on reuse distance theory

Authors: Henk Corporaal, Cedric Nugteren, Henri E. Bal, Gert-Jan van den Braak

Published in: Proceedings of the IEEE 20th International Symposium on High Performance Computer Architecture (HPCA-2014), 15-19 February 2014, Orlando, Florida, 37-48. IEEE CS.
Vrije Universiteit Amsterdam

As modern GPUs rely partly on their on-chip memories to counter the imminent off-chip memory wall, the efficient use of their caches has become important for performance and energy. However, optimising cache locality systematically requires insight

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a8eb6190a206b467b499de9d7780695b
https://research.tue.nl/en/publications/66653e1d-2ce7-44ab-8543-744cd8fa4d3b

Algorithmic species revisited : a program code classification based on array references

Authors: Rosilde Corvino, Cedric Nugteren, Henk Corporaal

Published in: Proceedings of MuCoCoS-6: International Workshop on Multi-/Many-core Computing Systems, 7 September 2013, Edinburgh, Scotland, UK, 1-8

The shift towards parallel processor architectures has made programming, performance prediction and code generation increasingly challenging. Abstract representations of program code (i.e. classifications) have been introduced to address this challen

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a3276559efe099cb49be5619df99e64f
https://doi.org/10.1109/MuCoCoS.2013.6633604

Automatic skeleton-based compilation through integration with an algorithm classification

Authors: Cedric Nugteren, Pieter Custers, Henk Corporaal

Published in: Advanced parallel processing technologies : 10th international symposium, APPT 2013, Stockholm, Sweden, August 27-28, 2013 : revised selected papers, 184-198
Lecture Notes in Computer Science ISBN: 9783642452925

This paper presents a technique to fully automatically generate efficient and readable code for parallel processors. We base our approach on skeleton-based compilation and 'algorithmic species', an algorithm classification of program code. We use a t

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6415a103b0c9372317e7db2bc725e5cd
https://research.tue.nl/nl/publications/2d7b1048-8f2b-4726-b169-4190e2d5b654
