Showing 1 - 10 of 1,043 for the search: '"CANO, Jose A."'
Non-uniform quantization, such as power-of-two (PoT) quantization, matches data distributions better than uniform quantization, which reduces the quantization error of Deep Neural Networks (DNNs). PoT quantization also allows bit-shift operations to …
External link:
http://arxiv.org/abs/2409.20403
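The snippet above notes that PoT quantization constrains values to signed powers of two, so multiplications can become bit-shifts. A minimal NumPy sketch of the idea (the function name and the exponent-clipping scheme are illustrative assumptions, not the paper's method):

```python
import numpy as np

def pot_quantize(w, n_bits=4):
    """Hypothetical sketch: map each weight to sign(w) * 2**e.

    Multiplying an activation by 2**e is a bit-shift in integer
    hardware, which is the efficiency argument for PoT quantization.
    """
    scale = np.max(np.abs(w))          # normalize largest magnitude to 1
    normalized = w / scale
    with np.errstate(divide="ignore"):
        exp = np.round(np.log2(np.abs(normalized)))
    # Clip exponents to what n_bits can encode (illustrative range).
    min_exp = -(2 ** (n_bits - 1)) + 1
    exp = np.clip(exp, min_exp, 0)
    q = np.sign(normalized) * np.exp2(exp)
    q[normalized == 0] = 0.0           # keep exact zeros
    return q * scale
```

For example, `pot_quantize(np.array([0.5, 0.26, -1.0]))` snaps 0.26 to the nearest power of two, 0.25, while 0.5 and -1.0 are already representable.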
The increase in open-source availability of Large Language Models (LLMs) has enabled users to deploy them on more and more resource-constrained edge devices to reduce reliance on network connections and provide more privacy. However, the high computational …
External link:
http://arxiv.org/abs/2408.00462
As custom hardware accelerators become more prevalent, it becomes increasingly important to automatically generate efficient host-driver code that can fully leverage the capabilities of these accelerators. This approach saves time and reduces the likelihood …
External link:
http://arxiv.org/abs/2402.19184
Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and leverage optimization features that may be exclusively provided in one deep learning framework. However, this conversion process may …
External link:
http://arxiv.org/abs/2312.15101
Author:
Agostini, Nicolas Bohm, Haris, Jude, Gibson, Perry, Jayaweera, Malith, Rubin, Norm, Tumeo, Antonino, Abellán, José L., Cano, José, Kaeli, David
This paper addresses the need for automatic and efficient generation of host driver code for arbitrary custom AXI-based accelerators targeting linear algebra algorithms, an important workload in various applications, including machine learning and scientific …
External link:
http://arxiv.org/abs/2312.14821
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices. Since such devices are where many emerging deep learning applications lie (e.g., drones, vision-based …
External link:
http://arxiv.org/abs/2311.08909
When deploying Deep Neural Networks (DNNs), developers often convert models from one deep learning framework to another (e.g., TensorFlow to PyTorch). However, this process is error-prone and can impact target model accuracy. To identify the extent of …
External link:
http://arxiv.org/abs/2306.06157
Image recognition tasks typically use deep learning and require enormous processing power, thus relying on hardware accelerators like GPUs and TPUs for fast, timely processing. Failure in real-time image recognition tasks can occur due to sub-optimal …
External link:
http://arxiv.org/abs/2306.06208
The increased utilization of Artificial Intelligence (AI) solutions brings with it inherent risks, such as misclassification and sub-optimal execution time performance, due to errors introduced in their deployment infrastructure because of problematic …
External link:
http://arxiv.org/abs/2306.01697
Author:
Ayaz, Ferheen, Zakariyya, Idris, Cano, José, Keoh, Sye Loong, Singer, Jeremy, Pau, Danilo, Kharbouche-Harrari, Mounia
Reducing the memory footprint of Machine Learning (ML) models, particularly Deep Neural Networks (DNNs), is essential to enable their deployment into resource-constrained tiny devices. However, a disadvantage of DNN models is their vulnerability to adversarial …
External link:
http://arxiv.org/abs/2304.12829