Showing 1 - 10 of 6,471 results for search: '"Verhelst, A"'
The widely-used, weight-only quantized large language models (LLMs), which leverage low-bit integer (INT) weights and retain floating-point (FP) activations, reduce storage requirements while maintaining accuracy. However, this shifts the energy and
External link:
http://arxiv.org/abs/2411.15982
Author:
Yi, Xiaoling, Antonio, Ryan, Dumoulin, Joren, Sun, Jiacong, Van Delm, Josse, Paim, Guilherme, Verhelst, Marian
Deep neural networks (DNNs) face significant challenges when deployed on resource-constrained extreme edge devices due to their computational and data-intensive nature. While standalone accelerators tailored for specific application scenarios suffer
External link:
http://arxiv.org/abs/2411.09543
Wrinkling is the phenomenon of out-of-plane deformation patterns in thin-walled structures, as a result of local compressive (internal) loads in combination with a large membrane stiffness and a small but non-zero bending stiffness. Numerical model
External link:
http://arxiv.org/abs/2410.16990
Author:
Hamdi, Mohamed Amine, Daghero, Francesco, Sarda, Giuseppe Maria, Van Delm, Josse, Symons, Arne, Benini, Luca, Verhelst, Marian, Pagliari, Daniele Jahier, Burrello, Alessio
Streamlining the deployment of Deep Neural Networks (DNNs) on heterogeneous edge platforms, coupling within the same micro-controller unit (MCU) instruction processors and hardware accelerators for tensor computations, is becoming one of the crucial
External link:
http://arxiv.org/abs/2410.08855
Author:
Houshmand, Pouya, Verhelst, Marian
In-memory computing hardware accelerators allow more than 10x improvements in peak efficiency and performance for matrix-vector multiplications (MVM) compared to conventional digital designs. For this, they have gained great interest for the accelera
External link:
http://arxiv.org/abs/2409.11437
Published in:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 7, pp. 945-958, July 2023
To achieve high accuracy, convolutional neural networks (CNNs) are increasingly growing in complexity and diversity in layer types and topologies. This makes it very challenging to efficiently deploy such networks on custom processor architectures fo
External link:
http://arxiv.org/abs/2406.13752
Published in:
2023 IEEE International Symposium on Workload Characterization (IISWC)
GPGPU execution analysis has always been tied to closed-source, proprietary benchmarking tools that provide high-level, non-exhaustive, and/or statistical information, preventing a thorough understanding of bottlenecks and optimization possibilities.
External link:
http://arxiv.org/abs/2407.11999
Author:
Shi, Man, Colleman, Steven, VanDeMieroop, Charlotte, Joseph, Antony, Meijer, Maurice, Dehaene, Wim, Verhelst, Marian
Published in:
2023 24th International Symposium on Quality Electronic Design (ISQED)
Deep neural networks (DNN) use a wide range of network topologies to achieve high accuracy within diverse applications. This model diversity makes it impossible to identify a single "dataflow" (execution schedule) to perform optimally across all poss
External link:
http://arxiv.org/abs/2406.14574
The impact of transformer networks is booming, yet they come with significant computational complexity. It is therefore essential to understand how to optimally map and execute these networks on modern neural processor hardware. So far, literature o
External link:
http://arxiv.org/abs/2406.09804
Author:
Van Delm, Josse, Vandersteegen, Maarten, Burrello, Alessio, Sarda, Giuseppe Maria, Conti, Francesco, Pagliari, Daniele Jahier, Benini, Luca, Verhelst, Marian
Published in:
2023 60th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 2023, pp. 1-6
Optimal deployment of deep neural networks (DNNs) on state-of-the-art Systems-on-Chips (SoCs) is crucial for tiny machine learning (TinyML) at the edge. The complexity of these SoCs makes deployment non-trivial, as they typically contain multiple het
External link:
http://arxiv.org/abs/2406.07453