Showing 1 - 10 of 117 for the search: '"F. Dolz"'
Author:
Manuel F. Dolz, Héctor Martínez, Adrián Castelló, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí
Published in:
The Journal of Supercomputing, 79:10589-10610
We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, augmenting the portability of the …
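The Winograd minimal filtering technique this line of work builds on can be shown in a few lines. The following is a minimal Python/NumPy sketch of the 1-D case F(2,3), included purely for illustration (it is not the authors' code): two outputs of a 3-tap filter are obtained from a 4-element input tile using 4 multiplications instead of 6.

    import numpy as np

    def winograd_f23(d, g):
        # Winograd F(2,3): 2 outputs of the 3-tap filter g over input tile d,
        # using 4 multiplications where direct filtering needs 6.
        m1 = (d[0] - d[2]) * g[0]
        m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
        m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
        m4 = (d[1] - d[3]) * g[2]
        return np.array([m1 + m2 + m3, m2 - m3 - m4])

    d = np.array([1.0, 2.0, 3.0, 4.0])         # input tile
    g = np.array([0.5, 1.0, -1.0])             # filter taps
    print(winograd_f23(d, g))                  # [-0.5  0. ]
    print(np.array([d[0:3] @ g, d[1:4] @ g]))  # direct result, identical

In the 2-D convolutions used by deep neural networks, the same input/filter/output transforms are applied along both dimensions and the element-wise multiplication stage becomes a batch of small matrix products, which is where architecture-specific tuning pays off.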
Published in:
IEEE Access, Vol. 6, pp. 39944-39961 (2018)
Parallelizing and optimizing codes for recent multi-/many-core processors has been recognized as a complex task. For this reason, strategies to automatically transform sequential codes into parallel ones and discover optimization opportunities are crucial …
External link:
https://doaj.org/article/77d97f163c6d4837b43a582c58d075e9
Published in:
2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD).
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
For many distributed applications, data communication poses a major bottleneck in terms of both performance and energy consumption. As more cores are integrated per node, the global performance of the system generally increases, yet …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e3bb0adbad38b27ee9a92a955a44f570
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
Tuning and optimising the operations executed in deep learning frameworks is a fundamental task in accelerating the processing of deep neural networks (DNNs). However, this optimisation usually requires extensive manual effort to obtain the … (an illustrative autotuning sketch follows this record)
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b1de51e45b08ba63594e6733b224b0ba
http://hdl.handle.net/10234/198110
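As a concrete illustration of replacing manual effort with automation, the sketch below shows empirical autotuning in its simplest form: time each candidate configuration of a cache-sensitive kernel and keep the fastest. The blocked-transpose kernel and the candidate block sizes are assumptions chosen for illustration, not the method of this publication.

    import time
    import numpy as np

    def blocked_transpose(a, b, bs):
        # Cache-blocked transpose: throughput depends strongly on block size bs.
        n = a.shape[0]
        for i in range(0, n, bs):
            for j in range(0, n, bs):
                b[j:j+bs, i:i+bs] = a[i:i+bs, j:j+bs].T

    def autotune(n=2048, candidates=(16, 32, 64, 128, 256)):
        a = np.random.rand(n, n).astype(np.float32)
        b = np.empty_like(a)
        timings = {}
        for bs in candidates:
            start = time.perf_counter()
            blocked_transpose(a, b, bs)
            timings[bs] = time.perf_counter() - start
        return min(timings, key=timings.get)   # best-performing block size

    print("selected block size:", autotune())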
Paper presented at the 2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). We take a step forward in the direction of developing high-performance codes for the convolution, based on the Winograd …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5fdbf0d3b97afabd58f6760031cd3266
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc. As a consequence, there is notable interest in developing high-performance realizations … (see the im2col sketch after this record)
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::51d88b4cd8ca4f5a9919b94eb89a2417
https://doi.org/10.1016/j.jpdc.2022.05.009
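A standard high-performance realization of the convolution lowers it to a single matrix multiplication via im2col, so that an optimized GEMM does the heavy lifting. The following NumPy sketch (stride 1, no padding, cross-correlation as is conventional in DNN frameworks) illustrates the general technique only; it is not the code from this publication.

    import numpy as np

    def im2col(x, kh, kw):
        # x: (C, H, W) input -> (C*kh*kw, OH*OW) matrix of unfolded patches.
        c, h, w = x.shape
        oh, ow = h - kh + 1, w - kw + 1
        cols = np.empty((c * kh * kw, oh * ow), dtype=x.dtype)
        row = 0
        for ci in range(c):
            for i in range(kh):
                for j in range(kw):
                    cols[row] = x[ci, i:i+oh, j:j+ow].reshape(-1)
                    row += 1
        return cols

    def conv2d_gemm(x, f):
        # f: (K, C, kh, kw) filters; the whole convolution becomes one GEMM.
        k, c, kh, kw = f.shape
        oh, ow = x.shape[1] - kh + 1, x.shape[2] - kw + 1
        out = f.reshape(k, -1) @ im2col(x, kh, kw)   # (K, OH*OW)
        return out.reshape(k, oh, ow)

    x = np.random.rand(3, 8, 8)                              # C=3 input
    print(conv2d_gemm(x, np.random.rand(4, 3, 3, 3)).shape)  # (4, 6, 6)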
Published in:
e-Archivo: Repositorio Institucional de la Universidad Carlos III de Madrid
Universidad Carlos III de Madrid (UC3M)
Parallel Computing
In recent years, the large volumes of stream data and the near real-time requirements of data streaming applications have exacerbated the need for new scalable algorithms and programming interfaces for distributed and shared-memory platforms. …
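A recurring shape for such programming interfaces is the pipeline pattern: independent stages connected by bounded queues, so that different stream items are processed in different stages concurrently. The generic Python sketch below (threads plus queues, with a None sentinel closing the stream) only illustrates the concept; it does not reproduce the interface proposed in this work.

    import threading
    import queue

    def stage(fn, inq, outq):
        # Generic pipeline stage: apply fn to each item until the sentinel arrives.
        while True:
            item = inq.get()
            if item is None:                 # sentinel: propagate and stop
                if outq is not None:
                    outq.put(None)
                break
            result = fn(item)
            if outq is not None:
                outq.put(result)

    q1, q2 = queue.Queue(maxsize=64), queue.Queue(maxsize=64)
    results = []
    workers = [threading.Thread(target=stage, args=(lambda x: x * x, q1, q2)),
               threading.Thread(target=stage, args=(results.append, q2, None))]
    for w in workers:
        w.start()
    for i in range(10):                      # producer feeds the stream
        q1.put(i)
    q1.put(None)                             # close the stream
    for w in workers:
        w.join()
    print(results)                           # [0, 1, 4, 9, ..., 81]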
Published in:
IPDPS Workshops
We present PyDTNN, a framework for training deep neural networks (DNNs) on clusters of computers, designed as a research-oriented tool with a low learning curve. Our parallel training framework offers a set of functionalities that cover …
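The data-parallel scheme at the core of this kind of framework fits in a short sketch: every rank computes a gradient on its local data shard, the gradients are summed with an allreduce, and all ranks apply the identical update. The example below uses mpi4py and a toy linear model as assumptions for illustration; it is not PyDTNN's actual API.

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Toy model: linear least squares on a rank-local data shard.
    rng = np.random.default_rng(rank)
    x = rng.standard_normal((256, 8))
    y = rng.standard_normal(256)
    w = np.zeros(8)                               # replicated parameters

    for step in range(100):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)   # local gradient
        total = np.empty_like(grad)
        comm.Allreduce(grad, total, op=MPI.SUM)   # sum gradients across ranks
        w -= 0.01 * (total / size)                # same update on every rank

Run with, e.g., mpirun -np 4 python train.py; every rank ends with identical weights because all ranks apply the same averaged gradient.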
Author:
Mar Catalan, Enrique S. Quintana-Ortí, Jose I. Mestre, Adrián Castelló, José Duato, Manuel F. Dolz
Published in:
PDP
Training deep neural networks is a costly procedure, often performed via sophisticated deep learning frameworks on clusters of computers. As faster processor technologies are integrated into these cluster facilities (e.g., NVIDIA’s graphics accelerators), …