Showing 1 - 10 of 375 for search: '"FALSAFI, BABAK"'
Author:
Harma, Simla Burcu, Chakraborty, Ayan, Kostenok, Elizaveta, Mishin, Danila, Ha, Dongho, Falsafi, Babak, Jaggi, Martin, Liu, Ming, Oh, Yunho, Subramanian, Suvinay, Yazdanbakhsh, Amir
The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonst
External link:
http://arxiv.org/abs/2405.20935
Author:
Harma, Simla Burcu, Chakraborty, Ayan, Sperry, Nicholas, Falsafi, Babak, Jaggi, Martin, Oh, Yunho
The unprecedented demand for computing resources to train DNN models has led to a search for minimal numerical encoding. Recent state-of-the-art (SOTA) proposals advocate for multi-level scaled narrow bitwidth numerical formats. In this paper, we sho
External link:
http://arxiv.org/abs/2211.10737
Author:
Yüzügüler, Ahmet Caner, Sönmez, Canberk, Drumond, Mario, Oh, Yunho, Falsafi, Babak, Frossard, Pascal
Multi-pod systolic arrays are emerging as the architecture of choice in DNN inference accelerators. Despite their potential, designing multi-pod systolic arrays to maximize effective throughput/Watt (i.e., throughput/Watt adjusted when accounting for
External link:
http://arxiv.org/abs/2203.11540
Author:
Sadrosadati, Mohammad, Mirhosseini, Amirhossein, Hajiabadi, Ali, Ehsani, Seyed Borna, Falahati, Hajar, Sarbazi-Azad, Hamid, Drumond, Mario, Falsafi, Babak, Ausavarungnirun, Rachata, Mutlu, Onur
Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consump
External link:
http://arxiv.org/abs/2010.09330
Author:
Picorel, Javier, Kohroudi, Seyed Alireza Sanaee, Yan, Zi, Bhattacharjee, Abhishek, Falsafi, Babak, Jevdjic, Djordje
Virtual memory (VM) is critical to the usability and programmability of hardware accelerators. Unfortunately, implementing accelerator VM efficiently is challenging because the area and power constraints make it difficult to employ the large multi-le
External link:
http://arxiv.org/abs/2001.07045
Author:
Bhattacharyya, Atri, Sandulescu, Alexandra, Neugschwandtner, Matthias, Sorniotti, Alessandro, Falsafi, Babak, Payer, Mathias, Kurmus, Anil
Spectre, Meltdown, and related attacks have demonstrated that kernels, hypervisors, trusted execution environments, and browsers are prone to information disclosure through micro-architectural weaknesses. However, it remains unclear as to what extent
External link:
http://arxiv.org/abs/1903.01843
Author:
Stanley-Marbell, Phillip, Alaghi, Armin, Carbin, Michael, Darulova, Eva, Dolecek, Lara, Gerstlauer, Andreas, Gillani, Ghayoor, Jevdjic, Djordje, Moreau, Thierry, Cacciotti, Mattia, Daglis, Alexandros, Jerger, Natalie Enright, Falsafi, Babak, Misailovic, Sasa, Sampson, Adrian, Zufferey, Damien
When a computational task tolerates a relaxation of its specification or when an algorithm tolerates the effects of noise in its execution, hardware, programming languages, and system software can trade deviations from correct behavior for lower reso
External link:
http://arxiv.org/abs/1809.05859
The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithm
External link:
http://arxiv.org/abs/1804.01526
Author:
Ustiugov, Dmitrii, Daglis, Alexandros, Picorel, Javier, Sutherland, Mark, Bugnion, Edouard, Falsafi, Babak, Pnevmatikatos, Dionisios
With emerging storage-class memory (SCM) nearing commercialization, there is evidence that it will deliver the much-anticipated high density and access latencies within only a few factors of DRAM. Nevertheless, the latency-sensitive nature of memory-
External link:
http://arxiv.org/abs/1801.06726