Showing 1 - 10 of 71 for search: '"Perotti, Matteo"'
Authors:
Perotti, Matteo, Raeber, Michele, Sinigaglia, Mattia, Cavalcante, Matheus, Rossi, Davide, Benini, Luca
Multi-core vector processor architectures excel in handling computationally intensive vectorizable tasks but struggle to achieve optimal resource utilization when facing sequential and control tasks that cannot be vectorized. This work presents Spatz
External link:
http://arxiv.org/abs/2407.05447
Authors:
Rogenmoser, Michael, Ottaviano, Alessandro, Benz, Thomas, Balas, Robert, Perotti, Matteo, Garofalo, Angelo, Benini, Luca
In the last decade, we have witnessed exponential growth in the complexity of control systems for safety-critical applications (automotive, robots, industrial automation) and their transition to heterogeneous mixed-criticality systems (MCSs). The gro
External link:
http://arxiv.org/abs/2406.06546
Dense Matrix Multiplication (MatMul) is arguably one of the most ubiquitous compute-intensive kernels, spanning linear algebra, DSP, graphics, and machine learning applications. Thus, MatMul optimization is crucial not only in high-performance proces
External link:
http://arxiv.org/abs/2401.04012
Sparse matrix vector multiplication (SpMV) is central to numerous data-intensive applications, but requires streaming indirect memory accesses that severely degrade both processing and memory throughput in state-of-the-art architectures. Near-memory
External link:
http://arxiv.org/abs/2311.10378
From classical HPC to deep learning, MatMul is at the heart of today's computing. The recent Maddness method approximates MatMul without the need for multiplication by using a hash-based version of product quantization (PQ) indexing into a look-up ta
External link:
http://arxiv.org/abs/2311.10207
Vector processing is highly effective in boosting processor performance and efficiency for data-parallel workloads. In this paper, we present Ara2, the first fully open-source vector processor to support the RISC-V V 1.0 frozen ISA. We evaluate Ara2'
External link:
http://arxiv.org/abs/2311.07493
The ever-increasing computational and storage requirements of modern applications and the slowdown of technology scaling pose major challenges to designing and implementing efficient computer architectures. In this paper, we leverage the architectura
External link:
http://arxiv.org/abs/2309.10137
DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training
Authors:
Garofalo, Angelo, Tortorella, Yvan, Perotti, Matteo, Valente, Luca, Nadalini, Alessandro, Benini, Luca, Rossi, Davide, Conti, Francesco
On-chip DNN inference and training at the Extreme-Edge (TinyML) impose strict latency, throughput, accuracy and flexibility requirements. Heterogeneous clusters are promising solutions to meet the challenge, combining the flexibility of DSP-enhanced
External link:
http://arxiv.org/abs/2303.17954
Authors:
AskariHemmat, MohammadHossein, Dupuis, Theo, Fournier, Yoan, Zarif, Nizar El, Cavalcante, Matheus, Perotti, Matteo, Gurkaynak, Frank, Benini, Luca, Leduc-Primeau, Francois, Savaria, Yvon, David, Jean-Pierre
In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector
External link:
http://arxiv.org/abs/2302.05996
Data-intensive applications involving irregular memory streams are inefficiently handled by modern processors and memory systems highly optimized for regular, contiguous data. Recent work tackles these inefficiencies in hardware through core-side str
External link:
http://arxiv.org/abs/2211.10409