Showing 1 - 10 of 17 for search: '"Aminabadi, Reza Yazdani"'
Author:
Wu, Xiaoxia, Xia, Haojun, Youn, Stephen, Zheng, Zhen, Chen, Shiyang, Bakhtiari, Arash, Wyatt, Michael, Aminabadi, Reza Yazdani, He, Yuxiong, Ruwase, Olatunji, Song, Leon, Yao, Zhewei
This study examines 4-bit quantization methods such as GPTQ in large language models (LLMs), highlighting GPTQ's overfitting and limited enhancement on zero-shot tasks. While prior works focus merely on zero-shot measurement, we extend the task scope to…
External link:
http://arxiv.org/abs/2312.08583
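As background for this entry, 4-bit weight quantization maps FP16/FP32 weights onto 16 integer levels. A minimal sketch of symmetric round-to-nearest INT4 quantization in NumPy (a toy illustration only; GPTQ itself additionally minimizes layer-wise reconstruction error rather than rounding naively):

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Symmetric round-to-nearest 4-bit quantization of a weight tensor.

    Toy sketch: real methods such as GPTQ pick quantized values to
    minimize layer-wise reconstruction error, not by naive rounding.
    """
    # Symmetric INT4 range is [-8, 7]; derive a single per-tensor scale.
    scale = max(np.abs(weights).max() / 7.0, 1e-8)
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 weights from INT4 codes and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int4(w)
print("max abs error:", np.abs(w - dequantize_int4(q, s)).max())
```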
Quantization techniques are pivotal in reducing the memory and computational demands of deep neural network inference. Existing solutions, such as ZeroQuant, offer dynamic quantization for models like BERT and GPT but overlook crucial memory-bounded operators…
External link:
http://arxiv.org/abs/2310.17723
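For context, dynamic quantization computes activation scales on the fly at inference time, often per token. A hedged sketch of per-token dynamic INT8 activation quantization (generic, not the ZeroQuant implementation):

```python
import numpy as np

def dynamic_int8_per_token(x: np.ndarray):
    """Per-token dynamic INT8 quantization of activations.

    x has shape (tokens, hidden); each token row gets its own scale,
    computed at inference time from that row's maximum magnitude.
    """
    scales = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scales = np.maximum(scales, 1e-8)  # avoid division by zero
    q = np.clip(np.round(x / scales), -128, 127).astype(np.int8)
    return q, scales

x = np.random.randn(3, 8).astype(np.float32)
q, s = dynamic_int8_per_token(x)
print("max abs error:", np.abs(x - q.astype(np.float32) * s).max())
```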
Author:
Yao, Zhewei, Aminabadi, Reza Yazdani, Ruwase, Olatunji, Rajbhandari, Samyam, Wu, Xiaoxia, Awan, Ammar Ahmad, Rasley, Jeff, Zhang, Minjia, Li, Conglong, Holmes, Connor, Zhou, Zhongzhu, Wyatt, Michael, Smith, Molly, Kurilenko, Lev, Qin, Heyang, Tanaka, Masahiro, Che, Shuai, Song, Shuaiwen Leon, He, Yuxiong
ChatGPT-like models have revolutionized various applications in artificial intelligence, from summarization and coding to translation, matching or even surpassing human performance. However, the current landscape lacks an accessible, efficient, and cost-effective…
External link:
http://arxiv.org/abs/2308.01320
Published in:
Fortieth International Conference on Machine Learning 2023
Improving the deployment efficiency of transformer-based language models has been challenging given their high computation and memory cost. While INT8 quantization has recently been shown to be effective in reducing both the memory cost and latency…
External link:
http://arxiv.org/abs/2301.12017
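A quick back-of-the-envelope illustration of why lower weight precision reduces memory (the parameter count below is hypothetical, not a figure from the paper):

```python
# Weight memory footprint of a hypothetical model, by precision.
params = 6.7e9  # illustrative parameter count
for name, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{name}: {params * bytes_per_param / 1e9:.1f} GB")
# FP16: 13.4 GB, INT8: 6.7 GB, INT4: 3.4 GB
```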
Author:
Aminabadi, Reza Yazdani, Rajbhandari, Samyam, Zhang, Minjia, Awan, Ammar Ahmad, Li, Cheng, Li, Du, Zheng, Elton, Rasley, Jeff, Smith, Shaden, Ruwase, Olatunji, He, Yuxiong
The past several years have witnessed the success of transformer-based models, and their scale and application scenarios continue to grow aggressively. The current landscape of transformer models is increasingly diverse: the model size varies drastically…
External link:
http://arxiv.org/abs/2207.00032
How to efficiently serve ever-larger trained natural language models in practice has become exceptionally challenging even for powerful cloud servers due to their prohibitive memory/computation requirements. In this work, we present an efficient and affordable…
External link:
http://arxiv.org/abs/2206.01861
Author:
Smith, Shaden, Patwary, Mostofa, Norick, Brandon, LeGresley, Patrick, Rajbhandari, Samyam, Casper, Jared, Liu, Zhun, Prabhumoye, Shrimai, Zerveas, George, Korthikanti, Vijay, Zhang, Elton, Child, Rewon, Aminabadi, Reza Yazdani, Bernauer, Julie, Song, Xia, Shoeybi, Mohammad, He, Yuxiong, Houston, Michael, Tiwary, Saurabh, Catanzaro, Bryan
Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success, the size of these models…
External link:
http://arxiv.org/abs/2201.11990
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Author:
Rajbhandari, Samyam, Li, Conglong, Yao, Zhewei, Zhang, Minjia, Aminabadi, Reza Yazdani, Awan, Ammar Ahmad, Rasley, Jeff, He, Yuxiong
As the training of giant dense models hits the boundary on the availability and capability of the hardware resources today, Mixture-of-Experts (MoE) models have become one of the most promising model architectures due to their significant training cost reduction…
External link:
http://arxiv.org/abs/2201.05596
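For intuition, an MoE layer routes each token to a small subset of expert networks, so per-token compute stays roughly constant while the total parameter count grows with the number of experts. A minimal top-1 routing sketch in PyTorch (a toy illustration, not DeepSpeed-MoE's implementation):

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Top-1 gated Mixture-of-Experts layer (toy sketch)."""

    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(n_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its single best expert.
        probs = self.gate(x).softmax(dim=-1)
        top_prob, top_idx = probs.max(dim=-1)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # Scale by the gate probability so routing stays differentiable.
                out[mask] = top_prob[mask, None] * expert(x[mask])
        return out

moe = ToyMoE(d_model=16, n_experts=4)
print(moe(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```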
Author:
Ren, Jie, Rajbhandari, Samyam, Aminabadi, Reza Yazdani, Ruwase, Olatunji, Yang, Shuangyan, Zhang, Minjia, Li, Dong, He, Yuxiong
Large-scale model training has been a playground for a limited few, requiring complex model refactoring and access to prohibitively expensive GPU clusters. ZeRO-Offload changes the large model training landscape by making large model training accessible…
External link:
http://arxiv.org/abs/2101.06840
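The core idea behind ZeRO-Offload is to keep optimizer states and the optimizer step in CPU memory while the GPU holds only the FP16 weights and activations needed for forward/backward. A highly simplified sketch of that pattern (illustrative only; the training loop and variable names are invented, not ZeRO-Offload's actual code):

```python
import torch

# Toy illustration of the offload pattern: FP16 compute on GPU,
# FP32 master weights and Adam states kept in CPU RAM.
device = "cuda" if torch.cuda.is_available() else "cpu"
gpu_param = torch.randn(1024, 1024, device=device,
                        dtype=torch.float16, requires_grad=True)
cpu_master = gpu_param.detach().float().cpu()  # FP32 master copy on CPU
optimizer = torch.optim.Adam([cpu_master.requires_grad_()], lr=1e-3)

for _ in range(3):
    loss = (gpu_param.float() ** 2).sum()           # dummy forward on GPU
    loss.backward()
    cpu_master.grad = gpu_param.grad.float().cpu()  # ship gradients to CPU
    optimizer.step()                                # Adam states stay on CPU
    optimizer.zero_grad()
    with torch.no_grad():
        gpu_param.copy_(cpu_master.to(device))      # refresh GPU FP16 weights
    gpu_param.grad = None
```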