Výsledky vyhledávání - "Yazdanbakhsh, Amir"

Report

CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming

Autor: TehraniJamsaz, Ali, Bhattacharjee, Arijit, Chen, Le, Ahmed, Nesreen K., Yazdanbakhsh, Amir, Jannesari, Ali

Recent advancements in Large Language Models (LLMs) have renewed interest in automatic programming language translation. Encoder-decoder transformer models, in particular, have shown promise in translating between different programming languages. How

Externí odkaz: http://arxiv.org/abs/2410.20527

Zobrazit plný text záznamu

Report

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

Autor: You, Haoran, Fu, Yichao, Wang, Zheng, Yazdanbakhsh, Amir, Lin, Yingyan Celine

Autoregressive Large Language Models (LLMs) have achieved impressive performance in language tasks but face two significant bottlenecks: (1) quadratic complexity in the attention module as the number of tokens increases, and (2) limited efficiency du

Externí odkaz: http://arxiv.org/abs/2406.07368

Zobrazit plný text záznamu

Report

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Autor: You, Haoran, Guo, Yipin, Fu, Yichao, Zhou, Wei, Shi, Huihong, Zhang, Xiaofan, Kundu, Souvik, Yazdanbakhsh, Amir, Lin, Yingyan Celine

Large language models (LLMs) have shown impressive performance on language tasks but face challenges when deployed on resource-constrained devices due to their extensive parameters and reliance on dense multiplications, resulting in high memory deman

Externí odkaz: http://arxiv.org/abs/2406.05981

Zobrazit plný text záznamu

Report

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Autor: Harma, Simla Burcu, Chakraborty, Ayan, Kostenok, Elizaveta, Mishin, Danila, Ha, Dongho, Falsafi, Babak, Jaggi, Martin, Liu, Ming, Oh, Yunho, Subramanian, Suvinay, Yazdanbakhsh, Amir

The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonst

Externí odkaz: http://arxiv.org/abs/2405.20935

Zobrazit plný text záznamu

Report

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

Autor: Mozaffari, Mohammad, Yazdanbakhsh, Amir, Zhang, Zhao, Dehnavi, Maryam Mehri

We propose SLoPe, a Double-Pruned Sparse Plus Lazy Low-rank Adapter Pretraining method for LLMs that improves the accuracy of sparse LLMs while accelerating their pretraining and inference and reducing their memory footprint. Sparse pretraining of LL

Externí odkaz: http://arxiv.org/abs/2405.16325

Zobrazit plný text záznamu

Report

Tao: Re-Thinking DL-based Microarchitecture Simulation

Autor: Pandey, Santosh, Yazdanbakhsh, Amir, Liu, Hang

Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements. While the quest for a fast, accurate and detailed microarchitecture simulat

Externí odkaz: http://arxiv.org/abs/2404.10921

Zobrazit plný text záznamu

Report

DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics

Autor: Kim, Yoonsung, Oh, Changhun, Hwang, Jinwoo, Kim, Wonung, Oh, Seongryong, Lee, Yubin, Sharma, Hardik, Yazdanbakhsh, Amir, Park, Jongse

Publikováno v: ISCA 2024

Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. However, real-world deployment faces challenges due to their limited computational resour

Externí odkaz: http://arxiv.org/abs/2403.14353

Zobrazit plný text záznamu

Report

Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers

Autor: Bambhaniya, Abhimanyu Rajeshkumar, Yazdanbakhsh, Amir, Subramanian, Suvinay, Kao, Sheng-Chun, Agrawal, Shivani, Evci, Utku, Krishna, Tushar

N:M Structured sparsity has garnered significant interest as a result of relatively modest overhead and improved efficiency. Additionally, this form of sparsity holds considerable appeal for reducing the memory footprint owing to their modest represe

Externí odkaz: http://arxiv.org/abs/2402.04744

Zobrazit plný text záznamu

Report

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Autor: Ding, Shaojin, Qiu, David, Rim, David, He, Yanzhang, Rybakov, Oleg, Li, Bo, Prabhavalkar, Rohit, Wang, Weiran, Sainath, Tara N., Han, Zhonglin, Li, Jian, Yazdanbakhsh, Amir, Agrawal, Shivani

End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the enormous memo

Externí odkaz: http://arxiv.org/abs/2312.08553

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání