Showing 1 - 9 of 9 results for the search: '"Daulbaev, Talgat"'
In this paper we generalize and extend the idea of low-rank adaptation (LoRA) of large language models (LLMs) based on the Transformer architecture. Widely used LoRA-like methods for fine-tuning LLMs are based on a matrix factorization of the gradient update. We…
External link:
http://arxiv.org/abs/2402.01376
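For context, here is a minimal NumPy sketch of the standard LoRA idea this paper builds on: the pretrained weight W is frozen and only a low-rank update B @ A is trained. All names, sizes, and the initialization are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 4                  # layer size and LoRA rank (toy values)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def forward(x):
    # Adapted layer: y = (W + B @ A) x, with W kept frozen during fine-tuning.
    return x @ (W + B @ A).T

x = rng.standard_normal((8, d_in))
print(forward(x).shape)  # (8, 64)
```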
A conventional approach to training neural ordinary differential equations (ODEs) is to fix an ODE solver and then learn the neural network's weights to optimize a target loss function. However, such an approach is tailored to a specific discretization…
External link:
http://arxiv.org/abs/2103.08561
Authors:
Gusak, Julia, Markeeva, Larisa, Daulbaev, Talgat, Katrutsa, Alexandr, Cichocki, Andrzej, Oseledets, Ivan
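To illustrate the setup the abstract describes, below is a toy sketch of a neural ODE integrated with one fixed solver (forward Euler), so the learned weights become tied to that particular discretization. The vector field and step count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 2)) * 0.1   # toy "neural" vector-field weights

def f(x, t):
    # Toy learned dynamics dx/dt = tanh(W x); stands in for a neural network.
    return np.tanh(W @ x)

def odeint_euler(x0, t0, t1, n_steps):
    # A fixed-solver setup bakes this discretization into the model:
    # weights trained with n_steps Euler steps need not transfer to
    # another solver or step size.
    h = (t1 - t0) / n_steps
    x, t = x0.copy(), t0
    for _ in range(n_steps):
        x = x + h * f(x, t)
        t += h
    return x

print(odeint_euler(np.ones(2), 0.0, 1.0, n_steps=10))
```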
Normalization is an important and extensively investigated technique in deep learning. However, its role in ordinary-differential-equation-based networks (neural ODEs) is still poorly understood. This paper investigates how different normalization techniques…
External link:
http://arxiv.org/abs/2004.09222
Authors:
Daulbaev, Talgat, Katrutsa, Alexandr, Markeeva, Larisa, Gusak, Julia, Cichocki, Andrzej, Oseledets, Ivan
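A hedged sketch of the kind of design choice investigated here: placing a normalization layer inside the ODE right-hand side. The toy vector field and the parameter-free layer norm are assumptions for illustration, not the configurations the paper evaluates.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * 0.5

def layer_norm(h, eps=1e-5):
    # Normalize features to zero mean / unit variance, as LayerNorm does
    # (learned affine parameters omitted for brevity).
    return (h - h.mean()) / np.sqrt(h.var() + eps)

def f(x, t):
    # ODE right-hand side with normalization inside the vector field;
    # where (and whether) to normalize is the design question under study.
    return np.tanh(layer_norm(W @ x))

x = rng.standard_normal(8)
x_next = x + 0.1 * f(x, 0.0)   # one Euler step of the normalized dynamics
print(x_next.shape)
```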
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models. We compare it with the reverse dynamic method (known in the literature as the "adjoint method") to train neural ODEs on classification, density…
External link:
http://arxiv.org/abs/2003.05271
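A sketch of the general idea, under the assumption that the forward trajectory is stored at Chebyshev nodes and reconstructed by barycentric interpolation during the backward pass, instead of re-solving the state ODE in reverse as the adjoint method does. The test function below stands in for an ODE solution.

```python
import numpy as np

# Chebyshev-Lobatto nodes on [0, 1] and the matching barycentric weights.
n = 16
k = np.arange(n + 1)
t_nodes = 0.5 * (1 - np.cos(np.pi * k / n))
w = (-1.0) ** k
w[0] *= 0.5
w[n] *= 0.5

# Pretend these are ODE states x(t) saved at the nodes during the forward solve.
x_nodes = np.sin(3 * t_nodes)

def interp(t):
    # Barycentric interpolation: reconstruct x(t) from the stored nodes, so a
    # backward (adjoint) pass can query the trajectory cheaply at any time t.
    d = t - t_nodes
    hit = np.isclose(d, 0)
    if hit.any():
        return x_nodes[hit][0]
    c = w / d
    return (c @ x_nodes) / c.sum()

print(interp(0.37), np.sin(3 * 0.37))  # interpolant vs. exact value
```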
Active subspace is a model reduction method widely used in the uncertainty quantification community. In this paper, we propose analyzing the internal structure and vulnerability of deep neural networks using active subspaces. Firstly, we employ the active subspace…
External link:
http://arxiv.org/abs/1910.13025
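For reference, a minimal sketch of the classical active-subspace construction the paper applies to neural networks: estimate the gradient covariance C = E[∇f ∇fᵀ] by Monte Carlo and take its dominant eigenvectors. The toy function is an assumption; for a network the gradients would come from backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
a = rng.standard_normal(d)

def grad_f(x):
    # Gradient of a toy function f(x) = sin(a . x); for a neural network
    # this would be the input gradient computed by backprop.
    return np.cos(a @ x) * a

# Estimate C = E[grad f grad f^T] by Monte Carlo over inputs.
X = rng.standard_normal((500, d))
G = np.stack([grad_f(x) for x in X])
C = G.T @ G / len(X)

# Eigenvectors of C with large eigenvalues span the active subspace:
# the directions along which f varies the most on average.
eigvals, eigvecs = np.linalg.eigh(C)
print(eigvals[::-1][:3])   # dominant eigenvalues (rank-one structure here)
print(eigvecs[:, -1])      # leading active direction (proportional to a)
```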
We introduce a new method for speeding up the inference of deep neural networks. It is somewhat inspired by reduced-order modeling techniques for dynamical systems. The cornerstone of the proposed method is the maximum volume algorithm. We demonstrate…
External link:
http://arxiv.org/abs/1910.06995
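A sketch of a textbook greedy maxvol iteration, which the abstract names as the method's cornerstone: swap rows until no swap increases the volume of the selected submatrix appreciably. Starting from the first r rows is a simplification; practical implementations start from a pivoted row set.

```python
import numpy as np

def maxvol(A, tol=1.05, max_iter=100):
    # Greedy maxvol: pick r rows of the tall matrix A (n x r) whose
    # submatrix has locally maximal volume |det|, i.e. the most
    # "representative" rows for a reduced model.
    n, r = A.shape
    idx = np.arange(r)                 # naive start: the first r rows
    for _ in range(max_iter):
        B = A @ np.linalg.inv(A[idx])  # n x r coefficient matrix
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:        # no swap grows the volume enough
            break
        idx[j] = i                     # replace row j of the submatrix by row i
    return idx

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
rows = maxvol(A)
print(rows, abs(np.linalg.det(A[rows])))
```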
This paper proposes a method to optimize the restriction and prolongation operators in the two-grid method. The proposed method extends straightforwardly to the geometric multigrid method (GMM). GMM is used in solving discretized partial differential equations…
External link:
http://arxiv.org/abs/1711.03825
Published in:
Journal of Computational and Applied Mathematics, vol. 368, April 2020
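To make the objects concrete, here is a standard two-grid cycle for a 1D Poisson problem. The prolongation P below is fixed linear interpolation with restriction R = 0.5 Pᵀ; the paper proposes optimizing the entries of such operators, which this classical sketch does not do.

```python
import numpy as np

def poisson_1d(n):
    # Second-order finite-difference Laplacian on n interior points.
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1))

def prolongation(n_coarse):
    # Linear-interpolation prolongation P (fine <- coarse); these entries
    # are exactly the kind of parameters one could instead optimize.
    n_fine = 2 * n_coarse + 1
    P = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def two_grid(A, b, x, P, n_smooth=3, omega=2.0 / 3.0):
    D = np.diag(A)
    for _ in range(n_smooth):                 # pre-smoothing (damped Jacobi)
        x = x + omega * (b - A @ x) / D
    R = 0.5 * P.T                             # full-weighting restriction
    A_c = R @ A @ P                           # Galerkin coarse operator
    x = x + P @ np.linalg.solve(A_c, R @ (b - A @ x))  # coarse correction
    for _ in range(n_smooth):                 # post-smoothing
        x = x + omega * (b - A @ x) / D
    return x

n_c = 15
A = poisson_1d(2 * n_c + 1)
b = np.ones(2 * n_c + 1)
x = np.zeros_like(b)
for _ in range(10):
    x = two_grid(A, b, x, prolongation(n_c))
print(np.linalg.norm(b - A @ x))  # residual shrinks rapidly per cycle
```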
Academic article
This result cannot be displayed to users who are not logged in; log in to view it.