Search results - "FALSAFI, BABAK"

Report

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Authors: Harma, Simla Burcu; Chakraborty, Ayan; Kostenok, Elizaveta; Mishin, Danila; Ha, Dongho; Falsafi, Babak; Jaggi, Martin; Liu, Ming; Oh, Yunho; Subramanian, Suvinay; Yazdanbakhsh, Amir

The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonst

External link: http://arxiv.org/abs/2405.20935

E-book

A primer on hardware prefetching / Babak Falsafi, Thomas F. Wenisch. [electronic resource]

Author: Falsafi, Babak

External link: KNAV e-book collection. Registered users: full text online for 5 minutes, further access on request.

Report

Accuracy Booster: Enabling 4-bit Fixed-point Arithmetic for DNN Training

Authors: Harma, Simla Burcu; Chakraborty, Ayan; Sperry, Nicholas; Falsafi, Babak; Jaggi, Martin; Oh, Yunho

The unprecedented demand for computing resources to train DNN models has led to a search for minimal numerical encoding. Recent state-of-the-art (SOTA) proposals advocate for multi-level scaled narrow bitwidth numerical formats. In this paper, we sho

External link: http://arxiv.org/abs/2211.10737

Report

Scale-out Systolic Arrays

Authors: Yüzügüler, Ahmet Caner; Sönmez, Canberk; Drumond, Mario; Oh, Yunho; Falsafi, Babak; Frossard, Pascal

Multi-pod systolic arrays are emerging as the architecture of choice in DNN inference accelerators. Despite their potential, designing multi-pod systolic arrays to maximize effective throughput/Watt (i.e., throughput/Watt adjusted when accounting for

External link: http://arxiv.org/abs/2203.11540

Report

Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation

Authors: Sadrosadati, Mohammad; Mirhosseini, Amirhossein; Hajiabadi, Ali; Ehsani, Seyed Borna; Falahati, Hajar; Sarbazi-Azad, Hamid; Drumond, Mario; Falsafi, Babak; Ausavarungnirun, Rachata; Mutlu, Onur

Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consump

External link: http://arxiv.org/abs/2010.09330

Report

SPARTA: A Divide and Conquer Approach to Address Translation for Accelerators

Authors: Picorel, Javier; Kohroudi, Seyed Alireza Sanaee; Yan, Zi; Bhattacharjee, Abhishek; Falsafi, Babak; Jevdjic, Djordje

Virtual memory (VM) is critical to the usability and programmability of hardware accelerators. Unfortunately, implementing accelerator VM efficiently is challenging because the area and power constraints make it difficult to employ the large multi-le

External link: http://arxiv.org/abs/2001.07045

Report

SMoTherSpectre: exploiting speculative execution through port contention

Authors: Bhattacharyya, Atri; Sandulescu, Alexandra; Neugschwandtner, Matthias; Sorniotti, Alessandro; Falsafi, Babak; Payer, Mathias; Kurmus, Anil

Spectre, Meltdown, and related attacks have demonstrated that kernels, hypervisors, trusted execution environments, and browsers are prone to information disclosure through micro-architectural weaknesses. However, it remains unclear as to what extent

External link: http://arxiv.org/abs/1903.01843

Report

Exploiting Errors for Efficiency: A Survey from Circuits to Algorithms

Authors: Stanley-Marbell, Phillip; Alaghi, Armin; Carbin, Michael; Darulova, Eva; Dolecek, Lara; Gerstlauer, Andreas; Gillani, Ghayoor; Jevdjic, Djordje; Moreau, Thierry; Cacciotti, Mattia; Daglis, Alexandros; Jerger, Natalie Enright; Falsafi, Babak; Misailovic, Sasa; Sampson, Adrian; Zufferey, Damien

When a computational task tolerates a relaxation of its specification or when an algorithm tolerates the effects of noise in its execution, hardware, programming languages, and system software can trade deviations from correct behavior for lower reso

External link: http://arxiv.org/abs/1809.05859

Report

Training DNNs with Hybrid Block Floating Point

Authors: Drumond, Mario; Lin, Tao; Jaggi, Martin; Falsafi, Babak

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithm

External link: http://arxiv.org/abs/1804.01526
