Showing 1 - 10 of 351 for search: '"Processament en paral·lel (Ordinadors)"'
Published in:
The Journal of Supercomputing.
Processors with multiple sockets or chiplets are becoming more conventional. These kinds of processors usually expose a single shared address space. However, due to hardware restrictions, they adopt a NUMA approach, where each processor accesses loca
Author:
Soria Pardos, Víctor, Armejach Sanosa, Adrià, Mück, Tiago, Suárez Gracía, Dario, Joao, Jose A., Rico, Alejandro, Moreto Planas, Miquel
With increasing core counts in modern multi-core designs, the overhead of synchronization jeopardizes the scalability and efficiency of parallel applications. To mitigate these overheads, modern cache-coherent protocols offer support for Atomic Memor
External link:
https://explore.openaire.eu/search/publication?articleId=od______3484::c8887afabdf6bff87332c29de5fb139b
https://hdl.handle.net/2117/390752
Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as OpenACC, are one such method
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b28a27200c92d5a59b3fc0319f73ef95
https://hdl.handle.net/2117/384604
Published in:
Euro-Par 2022: Parallel Processing Workshops ISBN: 9783031312083
Scientific applications are large and complex; task-based programming models are a popular approach to developing these applications due to their ease of programming and ability to handle complex workflows and distribute their workload across large i
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::793aa3c01f734f89df8fa4d25b4060ca
https://doi.org/10.1007/978-3-031-31209-0_19
Author:
López Paradís, Guillem, Li, Brian, Armejach Sanosa, Adrià, Wallentowitz, Stefan, Moreto Planas, Miquel, Balkind, Jonathan
Chips with tens of billions of transistors have become today's norm. These designs are straining our electronic design automation tools throughout the design process, requiring ever more computational resources. In many tools, parallelisation has imp
External link:
https://explore.openaire.eu/search/publication?articleId=od______3484::1fc5ca8611102f71fe9d6afb6efcbc02
https://hdl.handle.net/2117/390396
We propose fork-join and task-based hybrid implementations of four classical linear algebra iterative methods (Jacobi, Gauss-Seidel, conjugate gradient and biconjugate gradient stabilised) as well as variations of them. Algorithms are duly documented
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::eb9205c05f1c7c67ca0766908f115d3d
Author:
Ali, Omar Shaaban Ibrahim, Aguilar Mena, Jimmy, Beltran Querol, Vicenç, Carpenter, Paul Matthew, Ayguadé Parra, Eduard, Labarta Mancho, Jesús José
Published in:
2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD).
Task-based programming is a high performance and productive model to express parallelism. Tasks encapsulate work to be executed across multiple cores or offloaded to GPUs, FPGAs, other accelerators or other nodes. In order to maintain parallelism and
Author:
Jorge Ejarque, Pau Andrio, Adam Hospital, Javier Conejero, Daniele Lezzi, Josep LL. Gelpi, Rosa M. Badia
Developing complex biomolecular workflows is not always straightforward. It requires tedious developments to enable the interoperability between the different biomolecular simulation and analysis tools. Moreover, the need to execute the pipelines on
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0a36f17c561d526efb97bc98456a62a7
http://arxiv.org/abs/2208.14130
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
The general matrix-matrix multiplication (GEMM) kernel is a fundamental building block of many scientific applications. Many libraries such as Intel MKL and BLIS provide highly optimized sequential and parallel versions of this kernel. The parallel i
Published in:
2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software
In this work, we extend the auto-tuning process of the state-of-the-art TVM framework with XFeatur, a tool that extracts new meaningful hardware-related features that improve the quality of the representation of the search space and consequently impr