Showing 1 - 10 of 109 for search: '"Gestió de memòria (Informàtica)"'
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures
Author:
Catalán Pallarés, Sandra, Igual Peña, Francisco D., Herrero Zaragoza, José Ramón, Rodríguez Sánchez, Rafael, Quintana Ortí, Enrique Salvador
Published in:
Journal of Parallel and Distributed Computing. 175:51-65
We propose a methodology to address the programmability issues derived from the emergence of new-generation shared-memory NUMA architectures. For this purpose, we employ dense matrix factorizations and matrix inversion (DMFI) as a use case, and we ta
Author:
Peini Liu, Jordi Guitart
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Containerization technology offers an appealing alternative for encapsulating and operating applications (and all their dependencies) without being constrained by the performance penalties of using Virtual Machines and, as a result, has got the inter
GPU architectures have become popular for executing general-purpose programs which rely on having a large number of threads that run concurrently to hide the latency among dependent instructions. This approach has an important cost/overhead in terms
External link:
https://explore.openaire.eu/search/publication?articleId=od______3484::81089e8a7c426e08ea1cc7ddf6f5e849
https://hdl.handle.net/2117/389957
This paper demonstrates that state-of-the-art proposals to compute convolutions on architectures with CPUs supporting SIMD instructions deliver poor performance for long SIMD lengths due to frequent cache conflict misses. We first discuss how to adap
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c7f4517c9f176b65e3411d97755f478e
https://hdl.handle.net/2117/387544
Author:
Carlos Escuin, Asif Ali Khan, Pablo Ibáñez, Teresa Monreal, Jeronimo Castrillon, Víctor Viñals
Emerging non-volatile memory (NVM) technologies can potentially replace large SRAM memories such as the last-level cache (LLC). However, despite recent advances, NVMs suffer from higher write latency and limited write endurance. Recently, NVM-SRAM hy
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::93b5aa1ca81d959f3b1dc25259b531a6
https://hdl.handle.net/2117/387432
Author:
Ali, Omar Shaaban Ibrahim, Aguilar Mena, Jimmy, Beltran Querol, Vicenç, Carpenter, Paul Matthew, Ayguadé Parra, Eduard, Labarta Mancho, Jesús José
Published in:
2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD).
Task-based programming is a high performance and productive model to express parallelism. Tasks encapsulate work to be executed across multiple cores or offloaded to GPUs, FPGAs, other accelerators or other nodes. In order to maintain parallelism and
Author:
Mateu Barriendos, Elia
Contrary to popular assumption, Static RAM (SRAM), the main memory in most modern microcontrollers, temporarily retains its contents after power is lost. Instead of an immediate erase, SRAM data progressively degrades over a period (from milliseconds
External link:
https://explore.openaire.eu/search/publication?articleId=od______3484::f2123e50418f458d141f6f505642ddf7
https://hdl.handle.net/2117/375775
Published in:
2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software
In this work, we extend the auto-tuning process of the state-of-the-art TVM framework with XFeatur, a tool that extracts new meaningful hardware-related features that improve the quality of the representation of the search space and consequently impr
Author:
Alejandro J. Calderon, Leonidas Kosmidis, Peio Onaindia, Carlos F. Nicolás, Francisco J. Cazorla
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Critical real-time systems require strict resource provisioning in terms of memory and timing. The constant need for higher performance in these systems has led industry to recently include GPUs. However, GPU software ecosystems are by their nature c
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Object stores are widely used software stacks that achieve excellent scale-out with a well-defined interface and robust performance. However, their traditional get/put interface is unable to exploit data locality at its fullest, and limits reaching i
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::253db11af62e0a384b434895597e3131
https://hdl.handle.net/2117/356407