Showing 1 - 10
of 51
for the search: '"Umuroglu, Yaman"'
Quantization techniques commonly reduce the inference costs of neural networks by restricting the precision of weights and activations. Recent studies show that reducing the precision of the accumulator as well can further improve hardware efficiency at…
External link:
http://arxiv.org/abs/2401.10432
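The snippet above describes reducing accumulator precision on top of weight/activation quantization. A minimal numpy sketch of the idea, assuming simple uniform quantization and a wrap-around accumulator (the paper's actual method is more involved; all names here are illustrative):

```python
import numpy as np

def quantize(x, bits):
    # Uniformly quantize x to signed integers of the given bit width.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int64), scale

def dot_with_narrow_accumulator(w, a, w_bits=4, a_bits=4, acc_bits=16):
    # Integer dot product whose running sum is wrapped to acc_bits,
    # mimicking overflow in a hardware accumulator of limited width.
    qw, sw = quantize(w, w_bits)
    qa, sa = quantize(a, a_bits)
    acc_mod = 2 ** acc_bits
    acc = 0
    for wi, ai in zip(qw, qa):
        acc = (acc + int(wi) * int(ai)) % acc_mod  # wrap-around accumulation
    if acc >= acc_mod // 2:                        # reinterpret as signed
        acc -= acc_mod
    return acc * sw * sa                           # dequantize the result
```

A narrower `acc_bits` makes overflow (and thus accuracy loss) more likely, which is why accumulator width must be chosen or trained for, as the abstract suggests.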
Author:
Borras, Hendrik, Di Guglielmo, Giuseppe, Duarte, Javier, Ghielmetti, Nicolò, Hawks, Ben, Hauck, Scott, Hsu, Shih-Chieh, Kastner, Ryan, Liang, Jason, Meza, Andres, Muhizi, Jules, Nguyen, Tai, Roy, Rushil, Tran, Nhan, Umuroglu, Yaman, Weng, Olivia, Yokuda, Aidan, Blott, Michaela
We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of opt…
External link:
http://arxiv.org/abs/2206.11791
Author:
Pappalardo, Alessandro, Umuroglu, Yaman, Blott, Michaela, Mitrevski, Jovan, Hawks, Ben, Tran, Nhan, Loncar, Vladimir, Summers, Sioni, Borras, Hendrik, Muhizi, Jules, Trahms, Matthew, Hsu, Shih-Chieh, Hauck, Scott, Duarte, Javier
We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low precision quantization in existing ONNX-based quantizat…
External link:
http://arxiv.org/abs/2206.07527
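The operation such an arbitrary-precision IR node describes is quantize-dequantize ("fake quantization") at a configurable bit width. A generic sketch, assuming uniform quantization; the function name and parameters are illustrative, not the actual ONNX extension's schema:

```python
import numpy as np

def fake_quant(x, bits, scale, signed=True):
    # Quantize-dequantize at an arbitrary bit width: round to the integer
    # grid, clip to the representable range, then map back to real values.
    if signed:
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bits - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale
```

Because `bits` is an ordinary parameter rather than a fixed 8, the same node can describe 2-, 3-, or 5-bit tensors, which is what "arbitrary precision" refers to in the abstract.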
Author:
Orosa, Lois, Koppula, Skanda, Umuroglu, Yaman, Kanellopoulos, Konstantinos, Gomez-Luna, Juan, Blott, Michaela, Vissers, Kees, Mutlu, Onur
Dilated and transposed convolutions are widely used in modern convolutional neural networks (CNNs). These kernels are used extensively during CNN training and inference of applications such as image segmentation and high-resolution image generation.
External link:
http://arxiv.org/abs/2202.02310
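The two kernel types named in the abstract can be sketched in 1-D numpy, assuming the textbook definitions (the paper targets their efficient hardware execution, not this naive form):

```python
import numpy as np

def dilated_conv1d(x, k, dilation):
    # Dilated convolution: insert (dilation - 1) zeros between kernel taps,
    # enlarging the receptive field without adding parameters.
    kd = np.zeros((len(k) - 1) * dilation + 1)
    kd[::dilation] = k
    n = len(x) - len(kd) + 1
    return np.array([x[i:i + len(kd)] @ kd for i in range(n)])

def transposed_conv1d(x, k, stride):
    # Transposed convolution: scatter each input element, scaled by the
    # kernel, onto a stride-spaced output grid (used for upsampling, e.g.
    # in image segmentation and generation as the abstract notes).
    out = np.zeros((len(x) - 1) * stride + len(k))
    for i, xi in enumerate(x):
        out[i * stride:i * stride + len(k)] += xi * k
    return out
```

The inserted zeros in both kernels are exactly the inefficiency such work targets: a naive implementation multiplies by zero most of the time.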
Author:
Hawks, Benjamin, Duarte, Javier, Fraser, Nicholas J., Pappalardo, Alessandro, Tran, Nhan, Umuroglu, Yaman
Published in:
Front. AI 4, 94 (2021)
Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular techniques for…
External link:
http://arxiv.org/abs/2102.11289
Deployment of deep neural networks for applications that require very high throughput or extremely low latency is a severe computational challenge, further exacerbated by inefficiencies in mapping the computation to hardware. We present a novel metho…
External link:
http://arxiv.org/abs/2004.03021
Academic article
This result cannot be displayed to users who are not signed in. Sign in to view it.
Author:
Umuroglu, Yaman, Conficconi, Davide, Rasnayake, Lahiru, Preusser, Thomas B., Sjalander, Magnus
Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lends itself well to high-performance implementations. Many matrix multiplication-dependent…
External link:
http://arxiv.org/abs/1901.00370
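The data locality the abstract mentions is usually exploited by blocking: processing small sub-matrices so each block is reused while it sits in fast local storage. A purely illustrative numpy sketch (hardware designs such as the authors' use fixed-size processing arrays, not Python loops):

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    # Blocked matrix multiply: accumulate tile x tile sub-block products,
    # so each loaded block of A and B is reused across the inner loop.
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C
```

The result is identical to a plain matrix product; only the traversal order changes, which is what makes blocking attractive for hardware with small local memories.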
Author:
Blott, Michaela, Preusser, Thomas, Fraser, Nicholas, Gambardella, Giulio, O'Brien, Kenneth, Umuroglu, Yaman
Convolutional Neural Networks have rapidly become the most successful machine learning algorithm, enabling ubiquitous machine vision and intelligent decisions on even embedded computing systems. While the underlying arithmetic is structurally simple, …
External link:
http://arxiv.org/abs/1809.04570
Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
Author:
Blott, Michaela, Preusser, Thomas B., Fraser, Nicholas, Gambardella, Giulio, O'Brien, Kenneth, Umuroglu, Yaman, Leeser, Miriam
Convolutional Neural Networks have dramatically improved in recent years, surpassing human accuracy on certain problems and achieving performance exceeding that of traditional computer vision algorithms. While the compute pattern in itself is relatively simple…
External link:
http://arxiv.org/abs/1807.03123