Showing 1 - 10
of 51
for the search: '"Umuroglu, Yaman"'
Quantization techniques commonly reduce the inference costs of neural networks by restricting the precision of weights and activations. Recent studies show that reducing the precision of the accumulator as well can further improve hardware efficiency at…
External link:
http://arxiv.org/abs/2401.10432
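The snippet above describes reducing accumulator precision on top of weight/activation quantization. A minimal numpy sketch of the idea, assuming simple uniform quantization and a wrap-around accumulator (the paper's actual method is more involved; all names here are illustrative):

```python
import numpy as np

def quantize(x, bits):
    # Uniformly quantize x to signed integers of the given bit width.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int64), scale

def dot_with_narrow_accumulator(w, a, w_bits=4, a_bits=4, acc_bits=16):
    # Integer dot product whose running sum is wrapped to acc_bits,
    # mimicking overflow in a hardware accumulator of limited width.
    qw, sw = quantize(w, w_bits)
    qa, sa = quantize(a, a_bits)
    acc_mod = 2 ** acc_bits
    acc = 0
    for wi, ai in zip(qw, qa):
        acc = (acc + int(wi) * int(ai)) % acc_mod  # wrap-around accumulation
    if acc >= acc_mod // 2:                        # reinterpret as signed
        acc -= acc_mod
    return acc * sw * sa                           # dequantize the result
```

A narrower `acc_bits` makes overflow (and thus accuracy loss) more likely, which is why accumulator width must be chosen or trained for, as the abstract suggests.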
Author:
Borras, Hendrik, Di Guglielmo, Giuseppe, Duarte, Javier, Ghielmetti, Nicolò, Hawks, Ben, Hauck, Scott, Hsu, Shih-Chieh, Kastner, Ryan, Liang, Jason, Meza, Andres, Muhizi, Jules, Nguyen, Tai, Roy, Rushil, Tran, Nhan, Umuroglu, Yaman, Weng, Olivia, Yokuda, Aidan, Blott, Michaela
We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of opt…
External link:
http://arxiv.org/abs/2206.11791
Author:
Pappalardo, Alessandro, Umuroglu, Yaman, Blott, Michaela, Mitrevski, Jovan, Hawks, Ben, Tran, Nhan, Loncar, Vladimir, Summers, Sioni, Borras, Hendrik, Muhizi, Jules, Trahms, Matthew, Hsu, Shih-Chieh, Hauck, Scott, Duarte, Javier
We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low precision quantization in existing ONNX-based quantizat…
External link:
http://arxiv.org/abs/2206.07527
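The operation such an arbitrary-precision IR node describes is quantize-dequantize ("fake quantization") at a configurable bit width. A generic sketch, assuming uniform quantization; the function name and parameters are illustrative, not the actual ONNX extension's schema:

```python
import numpy as np

def fake_quant(x, bits, scale, signed=True):
    # Quantize-dequantize at an arbitrary bit width: round to the integer
    # grid, clip to the representable range, then map back to real values.
    if signed:
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bits - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale
```

Because `bits` is an ordinary parameter rather than a fixed 8, the same node can describe 2-, 3-, or 5-bit tensors, which is what "arbitrary precision" refers to in the abstract.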
Author:
Orosa, Lois, Koppula, Skanda, Umuroglu, Yaman, Kanellopoulos, Konstantinos, Gomez-Luna, Juan, Blott, Michaela, Vissers, Kees, Mutlu, Onur
Dilated and transposed convolutions are widely used in modern convolutional neural networks (CNNs). These kernels are used extensively during CNN training and inference of applications such as image segmentation and high-resolution image generation.
External link:
http://arxiv.org/abs/2202.02310
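The two kernel types named in the abstract can be sketched in 1-D numpy, assuming the textbook definitions (the paper targets their efficient hardware execution, not this naive form):

```python
import numpy as np

def dilated_conv1d(x, k, dilation):
    # Dilated convolution: insert (dilation - 1) zeros between kernel taps,
    # enlarging the receptive field without adding parameters.
    kd = np.zeros((len(k) - 1) * dilation + 1)
    kd[::dilation] = k
    n = len(x) - len(kd) + 1
    return np.array([x[i:i + len(kd)] @ kd for i in range(n)])

def transposed_conv1d(x, k, stride):
    # Transposed convolution: scatter each input element, scaled by the
    # kernel, onto a stride-spaced output grid (used for upsampling, e.g.
    # in image segmentation and generation as the abstract notes).
    out = np.zeros((len(x) - 1) * stride + len(k))
    for i, xi in enumerate(x):
        out[i * stride:i * stride + len(k)] += xi * k
    return out
```

The inserted zeros in both kernels are exactly the inefficiency such work targets: a naive implementation multiplies by zero most of the time.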
Author:
Hawks, Benjamin, Duarte, Javier, Fraser, Nicholas J., Pappalardo, Alessandro, Tran, Nhan, Umuroglu, Yaman
Published in:
Front. AI 4, 94 (2021)
Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular techniques for…
External link:
http://arxiv.org/abs/2102.11289
Deployment of deep neural networks for applications that require very high throughput or extremely low latency is a severe computational challenge, further exacerbated by inefficiencies in mapping the computation to hardware. We present a novel metho…
External link:
http://arxiv.org/abs/2004.03021
Academic article
This result cannot be displayed to users who are not signed in. Sign in to view it.
Author:
Umuroglu, Yaman, Conficconi, Davide, Rasnayake, Lahiru, Preusser, Thomas B., Sjalander, Magnus
Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lends itself well to high-performance implementations. Many matrix multiplication-dependent…
External link:
http://arxiv.org/abs/1901.00370
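The data locality the abstract mentions is usually exploited by blocking: processing small sub-matrices so each block is reused while it sits in fast local storage. A purely illustrative numpy sketch (hardware designs such as the authors' use fixed-size processing arrays, not Python loops):

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    # Blocked matrix multiply: accumulate tile x tile sub-block products,
    # so each loaded block of A and B is reused across the inner loop.
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C
```

The result is identical to a plain matrix product; only the traversal order changes, which is what makes blocking attractive for hardware with small local memories.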
Author:
Blott, Michaela, Preusser, Thomas, Fraser, Nicholas, Gambardella, Giulio, O'Brien, Kenneth, Umuroglu, Yaman
Convolutional Neural Networks have rapidly become the most successful machine learning algorithm, enabling ubiquitous machine vision and intelligent decisions on even embedded computing systems. While the underlying arithmetic is structurally simple, …
External link:
http://arxiv.org/abs/1809.04570
Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
Author:
Blott, Michaela, Preusser, Thomas B., Fraser, Nicholas, Gambardella, Giulio, O'Brien, Kenneth, Umuroglu, Yaman, Leeser, Miriam
Convolutional Neural Networks have dramatically improved in recent years, surpassing human accuracy on certain problems and achieving performance exceeding that of traditional computer vision algorithms. While the compute pattern in itself is relatively simple…
External link:
http://arxiv.org/abs/1807.03123