Zobrazeno 1 - 4
of 4
pro vyhledávání: '"Wright, Less"'
Autor:
Liang, Wanchao, Liu, Tianyu, Wright, Less, Constable, Will, Gu, Andrew, Huang, Chien-Chin, Zhang, Iris, Feng, Wei, Huang, Howard, Wang, Junjie, Purandare, Sanket, Nadathur, Gokul, Idreos, Stratos
The development of large language models (LLMs) has been instrumental in advancing state-of-the-art natural language processing applications. Training LLMs with billions of parameters and trillions of tokens require sophisticated distributed systems
Externí odkaz:
http://arxiv.org/abs/2410.06511
We propose an implementation of an efficient fused matrix multiplication kernel for W4A16 quantized inference, where we perform dequantization and GEMM in a fused kernel using a SplitK work decomposition. Our implementation shows improvement for the
Externí odkaz:
http://arxiv.org/abs/2402.00025
Autor:
Zhao, Yanli, Gu, Andrew, Varma, Rohan, Luo, Liang, Huang, Chien-Chin, Xu, Min, Wright, Less, Shojanazeri, Hamid, Ott, Myle, Shleifer, Sam, Desmaison, Alban, Balioglu, Can, Damania, Pritam, Nguyen, Bernard, Chauhan, Geeta, Hao, Yuchen, Mathews, Ajit, Li, Shen
It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development
Externí odkaz:
http://arxiv.org/abs/2304.11277
Autor:
Wright, Less, Demeure, Nestor
As optimizers are critical to the performances of neural networks, every year a large number of papers innovating on the subject are published. However, while most of these publications provide incremental improvements to existing algorithms, they te
Externí odkaz:
http://arxiv.org/abs/2106.13731