Showing 1 - 10 of 2,253
for search: '"Ryabinin, A."'
Author:
Zmushko, Philip, Mansurov, Marat, Svirschevski, Ruslan, Kuznedelev, Denis, Ryabinin, Max, Beznosikov, Aleksandr
As deep learning models become larger and more expensive, many practitioners turn to fine-tuning APIs. These web services allow fine-tuning a model between two parties: the client that provides the data, and the server that hosts the model. While con…
External link:
http://arxiv.org/abs/2412.16669
Author:
Jaghouar, Sami, Ong, Jack Min, Basra, Manveer, Obeid, Fares, Straube, Jannik, Keiblinger, Michael, Bakouch, Elie, Atkins, Lucas, Panahi, Maziyar, Goddard, Charles, Ryabinin, Max, Hagemann, Johannes
In this report, we introduce INTELLECT-1, the first 10 billion parameter language model collaboratively trained across the globe, demonstrating that large-scale model training is no longer confined to large corporations but can be achieved through a…
External link:
http://arxiv.org/abs/2412.01152
Author:
Weber, Maurice, Fu, Daniel, Anthony, Quentin, Oren, Yonatan, Adams, Shane, Alexandrov, Anton, Lyu, Xiaozhong, Nguyen, Huu, Yao, Xiaozhe, Adams, Virginia, Athiwaratkun, Ben, Chalamala, Rahul, Chen, Kezhen, Ryabinin, Max, Dao, Tri, Liang, Percy, Ré, Christopher, Rish, Irina, Zhang, Ce
Large language models are increasingly becoming a cornerstone technology in artificial intelligence, the sciences, and society as a whole, yet the optimal strategies for dataset composition and filtering remain largely elusive. Many of the top-performing…
External link:
http://arxiv.org/abs/2411.12372
Author:
Wang, Jiayi, Lu, Yao, Weber, Maurice, Ryabinin, Max, Chen, Yihong, Tang, Raphael, Stenetorp, Pontus
English, as a very high-resource language, enables the pretraining of high-quality large language models (LLMs). The same cannot be said for most other languages, as leading LLMs still underperform for non-English languages, likely due to a gap in the…
External link:
http://arxiv.org/abs/2410.23956
As large language models gain widespread adoption, running them efficiently becomes crucial. Recent works on LLM inference use speculative decoding to achieve extreme speedups. However, most of these works implicitly design their algorithms for high-end…
External link:
http://arxiv.org/abs/2406.02532
Author:
Hong, Giwon, Gema, Aryo Pradipta, Saxena, Rohit, Du, Xiaotang, Nie, Ping, Zhao, Yu, Perez-Beltrachini, Laura, Ryabinin, Max, He, Xuanli, Fourrier, Clémentine, Minervini, Pasquale
Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text. However, these models are prone to "hallucinations": outputs that do not align…
External link:
http://arxiv.org/abs/2404.05904
Author:
Chen, Zhuoming, May, Avner, Svirschevski, Ruslan, Huang, Yuhsun, Ryabinin, Max, Jia, Zhihao, Chen, Beidi
As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding has recently emerged as a promising direction for speeding up inference, existing methods…
External link:
http://arxiv.org/abs/2402.12374
Large language models demonstrate a remarkable capability for learning to solve new tasks from a few examples. The prompt template, or the way the input examples are formatted to obtain the prompt, is an important yet often overlooked aspect of in-context…
External link:
http://arxiv.org/abs/2401.06766
Author:
Borzunov, Alexander, Ryabinin, Max, Chumachenko, Artem, Baranchuk, Dmitry, Dettmers, Tim, Belkada, Younes, Samygin, Pavel, Raffel, Colin
Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, making them inaccessible to…
External link:
http://arxiv.org/abs/2312.08361
Author:
Baryshnikov, Anton, Ryabinin, Max
Text-to-image synthesis has recently attracted widespread attention due to rapidly improving quality and numerous practical applications. However, the language understanding capabilities of text-to-image models are still poorly understood, which make…
External link:
http://arxiv.org/abs/2310.09247