Showing 1 - 4 of 4
for search: '"Svirschevski, Ruslan"'
Author:
Egiazarian, Vage, Kuznedelev, Denis, Voronov, Anton, Svirschevski, Ruslan, Goin, Michael, Pavlov, Daniil, Alistarh, Dan, Baranchuk, Dmitry
Text-to-image diffusion models have emerged as a powerful framework for high-quality image generation given textual prompts. Their success has driven the rapid development of production-grade diffusion models that consistently increase in size and al…
External link:
http://arxiv.org/abs/2409.00492
As large language models gain widespread adoption, running them efficiently becomes crucial. Recent works on LLM inference use speculative decoding to achieve extreme speedups. However, most of these works implicitly design their algorithms for high-…
External link:
http://arxiv.org/abs/2406.02532
Author:
Chen, Zhuoming, May, Avner, Svirschevski, Ruslan, Huang, Yuhsun, Ryabinin, Max, Jia, Zhihao, Chen, Beidi
As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding has recently emerged as a promising direction for speeding up inference, existing methods…
External link:
http://arxiv.org/abs/2402.12374
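The two entries above both concern speculative decoding. As a loose illustration of the general idea only — a toy sketch with stand-in "models", not the algorithm of either paper — draft tokens are proposed cheaply and then verified against a more expensive target model:

```python
def draft_model(prefix):
    # Cheap draft model: a deterministic toy stand-in for a small LM.
    return (prefix[-1] + 1) % 10 if prefix else 0

def target_model(prefix):
    # Expensive target model whose outputs must be matched exactly (toy).
    return (prefix[-1] + 1) % 10 if prefix else 0

def speculative_step(prefix, k=4):
    """Draft k tokens with the cheap model, then verify them with the
    target model. Accept the longest agreeing prefix; at the first
    disagreement, emit the target's own token instead and stop."""
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_model(ctx)
        drafted.append(t)
        ctx.append(t)
    accepted, ctx = [], list(prefix)
    for t in drafted:
        correct = target_model(ctx)
        if t == correct:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(correct)  # target overrides the draft
            break
    return accepted
```

When the draft model agrees with the target (as in this toy), all k tokens are accepted per target-model pass, which is the source of the speedup; the papers above differ in how the draft trees are built and verified.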
Author:
Dettmers, Tim, Svirschevski, Ruslan, Egiazarian, Vage, Kuznedelev, Denis, Frantar, Elias, Ashkboos, Saleh, Borzunov, Alexander, Hoefler, Torsten, Alistarh, Dan
Recent advances in large language model (LLM) pretraining have led to high-quality LLMs with impressive abilities. By compressing such LLMs via quantization to 3-4 bits per parameter, they can fit into memory-limited devices such as laptops and mobil…
External link:
http://arxiv.org/abs/2306.03078
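The abstract above mentions compressing weights to 3-4 bits per parameter. A minimal sketch of plain symmetric round-to-nearest 4-bit quantization — a toy illustration of the baseline idea, not the paper's actual compression scheme:

```python
def quantize_4bit(weights):
    """Symmetric round-to-nearest quantization to 4-bit integer levels
    in [-8, 7]. Returns the integer codes and the shared scale."""
    scale = max(abs(w) for w in weights) / 7  # map max magnitude to level 7
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    # Reconstruct approximate float weights from the 4-bit codes.
    return [c * scale for c in codes]
```

Each weight is stored as a 4-bit code plus one shared scale per group, so storage drops from 16 bits to roughly 4 bits per parameter; methods like the one above additionally mitigate the accuracy loss that naive round-to-nearest incurs.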