Showing 1 - 10 of 209 for search: '"Marc Casas"'
Published in:
Future Generation Computer Systems. 143:152-162
This paper demonstrates that state-of-the-art proposals to compute convolutions on architectures with CPUs supporting SIMD instructions deliver poor performance for long SIMD lengths due to frequent cache conflict misses. We first discuss how to adap
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c7f4517c9f176b65e3411d97755f478e
https://hdl.handle.net/2117/387544
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Fused Multiply-Add (FMA) functional units constitute a fundamental hardware component to train Deep Neural Networks (DNNs). Its silicon area grows quadratically with the mantissa bit count of the computer number format, which has motivated the adopti
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::823fc95f064879d044fd15401ab7379c
https://hdl.handle.net/2117/373614
Author:
Xavier Buñuel, Teresa Alcoverro, Jordi Boada, Leire Zinkunegi, Timothy M. Smith, Anaïs Barrera, Marc Casas, Simone Farina, Marta Pérez, Javier Romero, Rohan Arthur, Jordi F. Pagès
Published in:
Oikos.
Published in:
Machine Learning and Knowledge Discovery in Databases ISBN: 9783031264184
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::72816afba0efaa54bb1b1807b3aafe31
https://doi.org/10.1007/978-3-031-26419-1_29
Published in:
2022 IEEE/ACM 7th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2).
Author:
Robin Kumar Sharma, Marc Casas
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
This paper proposes a novel parallel execution model for Bidirectional Recurrent Neural Networks (BRNNs), B-Par (Bidirectional-Parallelization), which exploits data and control dependencies for forward and reverse input computations. B-Par divides BR
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9230f37be0772c87236b902846605c49
https://hdl.handle.net/2117/372842
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Deep Neural Networks (DNNs) have become ubiquitous in a wide range of application domains. Despite their success, training DNNs is an expensive task that has motivated the use of reduced numerical precision formats to improve performance and reduce p
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::557623162c1e7bbe76208ccfc8931ee2
Author:
Georgios Vavouliotis, Gino Chacon, Lluc Alvarez, Paul V. Gratz, Daniel A. Jimenez, Marc Casas
The increase in working set sizes of contemporary applications outpaces the growth in cache sizes, resulting in frequent main memory accesses that deteriorate system performance due to the disparity between processor and memory speeds. Prefetching
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2e48390791694b2140c4a854088e4081
https://hdl.handle.net/2117/379247
Author:
Alexandre E. Eichenberger, Cristobal Ortega, Ramon Bertran, Marc Casas, Miquel Moreto, Pradip Bose, Lluc Alvarez, Alper Buyuktosunoglu
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Current microprocessors include several knobs to modify the hardware behavior in order to improve performance, power, and energy under different workload demands. An impractical and time-consuming offline profiling is needed to evaluate the design sp