Showing 1 - 10 of 3,046 for search: '"A. A. Mikhalev"'
Author:
Chekalina, Viktoriia, Rudenko, Anna, Mezentsev, Gleb, Mikhalev, Alexander, Panchenko, Alexander, Oseledets, Ivan
The performance of Transformer models has been enhanced by increasing the number of parameters and the length of the processed text. Consequently, fine-tuning the entire model becomes a memory-intensive process. High-performance methods for parameter
External link:
http://arxiv.org/abs/2410.07383
Author:
Merkulov, Daniil, Cherniuk, Daria, Rudikov, Alexander, Oseledets, Ivan, Muravleva, Ekaterina, Mikhalev, Aleksandr, Kashin, Boris
In this paper, we introduce an algorithm for data quantization based on the principles of Kashin representation. This approach hinges on decomposing any given vector, matrix, or tensor into two factors. The first factor maintains a small infinity norm
External link:
http://arxiv.org/abs/2404.09737
In this paper we generalize and extend an idea of low-rank adaptation (LoRA) of large language models (LLMs) based on Transformer architecture. Widely used LoRA-like methods of fine-tuning LLMs are based on matrix factorization of gradient update. We
External link:
http://arxiv.org/abs/2402.01376
LoRA is a technique that reduces the number of trainable parameters in a neural network by introducing low-rank adapters to linear layers. This technique is used both for fine-tuning and full training of large language models. This paper presents the
External link:
http://arxiv.org/abs/2312.03415
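The snippet above describes the general LoRA mechanism (low-rank adapters added to frozen linear layers). A minimal NumPy sketch of that idea, assuming illustrative dimensions, rank, and scaling; this is not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 2           # layer dims and adapter rank (illustrative)

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight, not trained
# Trainable low-rank factors: r*(d_in + d_out) parameters instead of d_in*d_out.
A = rng.normal(scale=0.01, size=(r, d_in))
B = np.zeros((d_out, r))            # zero init, so the adapter starts as a no-op

def forward(x, alpha=1.0):
    # y = W x + alpha * B A x; the low-rank update B A is never materialized.
    return W @ x + alpha * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(forward(x), W @ x)  # with B = 0 the adapter changes nothing
```

Only A and B would receive gradient updates during fine-tuning, which is what reduces the trainable-parameter count.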
In this paper, we prove a criterion of elementary equivalence of stable linear groups over fields of characteristic two.
Comment: 10 pages
External link:
http://arxiv.org/abs/2301.05613
The Linearized Poisson-Boltzmann (LPB) equation is a popular and widely accepted model for accounting for solvent effects in computational (bio-)chemistry. In the present article we derive the analytical forces of the domain-decomposition-based ddLPB-m
External link:
http://arxiv.org/abs/2203.00552
Author:
Vodyashkin, Andrey A., Ivanova, Anastasia A., Buryanskaya, Evgeniya L., Maltsev, Alexander A., Mikhalev, Pavel A., Ryzhenko, Dmitriy S., Makeev, Mstislav O.
Published in:
In Colloids and Surfaces A: Physicochemical and Engineering Aspects 20 November 2024 703 Part 1
Author:
Lopes, Diogo, Kovalevsky, Andrei V., Yaremchenko, Aleksey A., Mikhalev, Sergey M., Costa, F.M., Ferreira, Nuno M.
Published in:
In Journal of the European Ceramic Society February 2025 45(2)
Author:
Bershatsky, Daniel, Mikhalev, Aleksandr, Katrutsa, Alexandr, Gusak, Julia, Merkulov, Daniil, Oseledets, Ivan
In modern neural networks like Transformers, linear layers require significant memory to store activations during the backward pass. This study proposes a memory reduction approach to perform backpropagation through linear layers. Since the gradients of
External link:
http://arxiv.org/abs/2201.13195
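The abstract above concerns reducing the activations a linear layer must store for backpropagation. One generic way to do this, sketched here as my own illustration (not necessarily the authors' exact algorithm), is to keep a small random sketch of the activations and form an unbiased randomized estimate of the weight gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out, k = 256, 32, 16, 64   # batch, layer dims, sketch size (illustrative)

x = rng.normal(size=(n, d_in))        # activations saved at forward time
g = rng.normal(size=(n, d_out))       # upstream gradients arriving at backward time
# Random sketching matrix with E[S.T @ S] = I_n (iid entries, variance 1/k).
S = rng.normal(scale=1.0 / np.sqrt(k), size=(k, n))

dW_exact = g.T @ x                    # exact gradient: needs all n rows of x stored
dW_approx = (S @ g).T @ (S @ x)       # estimate: only the k-row sketch S @ x is stored

assert dW_approx.shape == dW_exact.shape == (d_out, d_in)
```

Memory for saved activations drops from O(n * d_in) to O(k * d_in), at the cost of variance in the gradient estimate; S itself can be regenerated from a stored random seed.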
Author:
Alexey V. Mikhalev
Published in:
RUDN Journal of Political Science, Vol 25, Iss 1, Pp 218-232 (2023)
The proposed paper is a study of resource nationalism. Resource nationalism appeared in Mongolia in the post-Socialist period. In this paper, we understand resource nationalism as a wide spectrum of strategies domestic elites employ in order to incre
External link:
https://doaj.org/article/55a7625e61554879bb9e000bac4e4fdd