Search results - "Pericàs, Miquel A"

Report

Challenges and Opportunities in the Co-design of Convolutions and RISC-V Vector Processors

Authors: Gupta, Sonia Rani, Papadopoulou, Nikela, Pericàs, Miquel

The RISC-V "V" extension introduces vector processing to the RISC-V architecture. Unlike most SIMD extensions, it supports long vectors which can result in significant improvement of multiple applications. In this paper, we present our ongoing resear

External link: http://arxiv.org/abs/2311.05284

Report

Analysis and Characterization of Performance Variability for OpenMP Runtime

Authors: Cui, Minyu, Papadopoulou, Nikela, Pericàs, Miquel

In the high performance computing (HPC) domain, performance variability is a major scalability issue for parallel computing applications with heavy synchronization and communication. In this paper, we present an experimental performance analysis of O

External link: http://arxiv.org/abs/2311.05267

Report

JOSS: Joint Exploration of CPU-Memory DVFS and Task Scheduling for Energy Efficiency

Authors: Chen, Jing, Manivannan, Madhavan, Goel, Bhavishya, Pericàs, Miquel

Energy-efficient execution of task-based parallel applications is crucial as tasking is a widely supported feature in many parallel programming libraries and runtimes. Currently, state-of-the-art proposals primarily rely on leveraging core asymmetry

External link: http://arxiv.org/abs/2306.04615

Report

ODIN: Overcoming Dynamic Interference in iNference pipelines

Authors: Soomro, Pirah Noor, Papadopoulou, Nikela, Pericàs, Miquel

As an increasing number of businesses becomes powered by machine-learning, inference becomes a core operation, with a growing trend to be offered as a service. In this context, the inference task must meet certain service-level objectives (SLOs), suc

External link: http://arxiv.org/abs/2306.01679

Report

Accelerating CNN inference on long vector architectures via co-design

Authors: Gupta, Sonia Rani, Papadopoulou, Nikela, Pericas, Miquel

CPU-based inference can be an alternative to off-chip accelerators, and vector architectures are a promising option due to their efficiency. However, the large design space of convolutional algorithms and hardware implementations makes it challenging

External link: http://arxiv.org/abs/2212.11574

Report

Energy-Efficiency Evaluation of OpenMP Loop Transformations and Runtime Constructs

Authors: Valter, Henrik, Karlsson, Axel, Pericàs, Miquel

OpenMP is the de facto API for parallel programming in HPC applications. These programs are often computed in data centers, where energy consumption is a major issue. Whereas previous work has focused almost entirely on performance, we here analyse a

External link: http://arxiv.org/abs/2209.04317

Academic article

Unravelling the mechanistic ‘Black Box’ of heterogeneous condensation reactions catalyzed by aminosilicas

Authors: Borah, Parijat, Nanda Sahu, Preeti, Sen, Anik, Pericàs, Miquel A.

Published in: Journal of Catalysis, December 2024, 440

Report

At the Locus of Performance: Quantifying the Effects of Copious 3D-Stacked Cache on HPC Workloads

Authors: Domke, Jens, Vatai, Emil, Gerofi, Balazs, Kodama, Yuetsu, Wahib, Mohamed, Podobas, Artur, Mittal, Sparsh, Pericàs, Miquel, Zhang, Lingqi, Chen, Peng, Drozd, Aleksandr, Matsuoka, Satoshi

Over the last three decades, innovations in the memory subsystem were primarily targeted at overcoming the data movement bottleneck. In this paper, we focus on a specific market trend in memory technology: 3D-stacked memory and caches. We investigate

External link: http://arxiv.org/abs/2204.02235

Report

Shisha: Online scheduling of CNN pipelines on heterogeneous architectures

Authors: Soomro, Pirah Noor, Abduljabbar, Mustafa, Castrillon, Jeronimo, Pericàs, Miquel

Chiplets have become a common methodology in modern chip design. Chiplets improve yield and enable heterogeneity at the level of cores, memory subsystem and the interconnect. Convolutional Neural Networks (CNNs) have high computational, bandwidth and

External link: http://arxiv.org/abs/2202.11575

Report

ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes

Authors: Chen, Jing, Manivannan, Madhavan, Abduljabbar, Mustafa, Pericàs, Miquel

Parallel applications often rely on work stealing schedulers in combination with fine-grained tasking to achieve high performance and scalability. However, reducing the total energy consumption in the context of work stealing runtimes is still challe

External link: http://arxiv.org/abs/2201.12186
