Showing 1 - 10 of 86 for search: '"Horváth, Samuel"'
As the number of parameters in large language models grows, pre-training and fine-tuning demand increasingly large amounts of GPU memory. A significant portion of this memory is typically consumed by the optimizer state.
External link:
http://arxiv.org/abs/2411.07837
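As context for the memory claim above: Adam, the de-facto optimizer for LLM training, keeps two extra values (first- and second-moment estimates) per parameter. A back-of-the-envelope sketch, assuming fp32 optimizer state (an assumption, not a figure from the paper):

```python
def adam_state_bytes(n_params, bytes_per_value=4):
    """Adam stores two moment estimates (m and v) per parameter,
    so its state alone is roughly 2x the fp32 parameter memory."""
    return 2 * n_params * bytes_per_value

# e.g., a 7B-parameter model carries ~56 GB of fp32 optimizer state,
# on top of the parameters and gradients themselves.
print(adam_state_bytes(7e9) / 1e9, "GB")
```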
Non-IID data is prevalent in real-world federated learning problems. Data heterogeneity can take several forms of distribution shift. In this work, we are interested in the heterogeneity that comes from concept shifts, i.e., shifts in …
External link:
http://arxiv.org/abs/2410.03497
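Concept shift is commonly formalized as a change in the conditional distribution P(y | x) while the feature distribution P(x) stays the same. A toy illustration (the paper's exact setting is cut off in the snippet above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))  # both clients draw features from the same P(x)

# Concept shift: P(y | x) differs across clients even though P(x) is shared.
y_client_a = (X[:, 0] > 0).astype(int)            # client A's labeling rule
y_client_b = (X[:, 0] + X[:, 1] > 0).astype(int)  # client B labels the same x differently
```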
Statistical data heterogeneity is a significant barrier to convergence in federated learning (FL). While prior work has advanced heterogeneous FL through better optimization objectives, these methods fall short when there is extreme data heterogeneity …
External link:
http://arxiv.org/abs/2410.03042
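One well-known example of a heterogeneity-aware objective is FedProx, shown here only to illustrate what "better optimization objectives" typically means in this literature; it is not necessarily the method of the linked paper:

```latex
% FedProx local objective: client k minimizes its own loss F_k plus a
% proximal term keeping the local model w close to the global model w^t.
\[
  \min_{w} \; h_k(w; w^t) \;=\; F_k(w) + \frac{\mu}{2}\,\lVert w - w^t \rVert^2
\]
```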
Author:
Gorbunov, Eduard, Tupitsa, Nazarii, Choudhury, Sayantan, Aliev, Alen, Richtárik, Peter, Horváth, Samuel, Takáč, Martin
Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., …)
External link:
http://arxiv.org/abs/2409.14989
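For reference, the $(L_0,L_1)$-smoothness condition of Zhang et al., stated for twice-differentiable $f$ (this is the standard form; the linked paper may work with a generalized variant):

```latex
% (L0,L1)-smoothness: the Hessian norm may grow with the gradient norm.
% Setting L1 = 0 recovers classical L-smoothness.
\[
  \lVert \nabla^2 f(x) \rVert \;\le\; L_0 + L_1 \,\lVert \nabla f(x) \rVert
  \qquad \text{for all } x .
\]
```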
Author:
Moskvoretskii, Viktor, Tupitsa, Nazarii, Biemann, Chris, Horváth, Samuel, Gorbunov, Eduard, Nikishina, Irina
We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the L…
External link:
http://arxiv.org/abs/2406.12564
This work tackles the challenges of data heterogeneity and communication limitations in decentralized federated learning. We focus on creating a collaboration graph that guides each client in selecting suitable collaborators for training personalized models …
External link:
http://arxiv.org/abs/2406.06520
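A generic sketch of how such a collaboration graph might be built, here from cosine similarity of clients' model updates with each client keeping its top-k peers; this heuristic and all names are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def collaboration_graph(updates, k=3):
    """Directed collaboration graph: client i points to the k clients whose
    model updates are most cosine-similar to its own (illustrative heuristic)."""
    U = np.stack(updates).astype(float)
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    sim = U @ U.T
    np.fill_diagonal(sim, -np.inf)           # a client never selects itself
    top_k = np.argsort(sim, axis=1)[:, -k:]  # k most similar peers per client
    return {i: list(map(int, top_k[i])) for i in range(len(updates))}
```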
Author:
Chezhegov, Savelii, Klyukin, Yaroslav, Semenov, Andrei, Beznosikov, Aleksandr, Gasnikov, Alexander, Horváth, Samuel, Takáč, Martin, Gorbunov, Eduard
Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. Typically, the noise in the stochastic gradients is heavy-tailed for the latter. Gradient clipping …
External link:
http://arxiv.org/abs/2406.04443
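For context, gradient clipping bounds the influence of any single heavy-tailed gradient sample. A minimal sketch of clipped SGD (the generic technique, not the specific method analyzed in the paper):

```python
import numpy as np

def clip_by_norm(grad, tau):
    """Scale the gradient so its l2 norm is at most tau (global-norm clipping)."""
    norm = np.linalg.norm(grad)
    return grad if norm <= tau else grad * (tau / norm)

def clipped_sgd_step(w, grad, lr=1e-2, tau=1.0):
    # With clipping, one heavy-tailed outlier gradient can move the iterate
    # by at most lr * tau, regardless of how large the raw gradient is.
    return w - lr * clip_by_norm(grad, tau)
```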
Federated learning (FL) has emerged as a pivotal approach in machine learning, enabling multiple participants to collaboratively train a global model without sharing raw data. While FL finds applications in various domains such as healthcare and finance …
External link:
http://arxiv.org/abs/2406.00569
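To make the "no raw data sharing" point concrete, here is a minimal FedAvg-style round; the least-squares local update is a stand-in assumption, not this paper's training procedure:

```python
import numpy as np

def local_update(w_global, data, lr=0.1, steps=5):
    """Local training on a client's private data; only weights leave the client."""
    w, (X, y) = w_global.copy(), data
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)  # least-squares gradient step
    return w

def fedavg_round(w_global, client_data):
    local_models = [local_update(w_global, d) for d in client_data]
    sizes = [len(d[1]) for d in client_data]
    # The server sees only the locally trained weights, never the raw (X, y).
    return np.average(local_models, axis=0, weights=sizes)
```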
Author:
Li, Yunxiang, Yuan, Rui, Fan, Chen, Schmidt, Mark, Horváth, Samuel, Gower, Robert M., Takáč, Martin
Policy gradient is a widely utilized and foundational algorithm in the field of reinforcement learning (RL). Although renowned for its convergence guarantees and stability compared to other RL algorithms, its practical application is often hindered by sensitivity …
External link:
http://arxiv.org/abs/2404.07525
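As a reminder of the method the abstract refers to, here is a vanilla REINFORCE update for a linear softmax policy; a generic sketch whose step size lr is exactly the kind of hyperparameter such papers call sensitive, not the linked paper's algorithm:

```python
import numpy as np

def softmax_policy(theta, s):
    """Action probabilities of a linear softmax policy; theta: (n_actions, n_features)."""
    logits = theta @ s
    p = np.exp(logits - logits.max())
    return p / p.sum()

def grad_log_policy(theta, s, a):
    """Gradient of log pi(a|s) for the softmax policy above."""
    g = -np.outer(softmax_policy(theta, s), s)
    g[a] += s
    return g

def reinforce_update(theta, trajectories, lr=0.01):
    # Vanilla policy gradient: ascend E[return * grad log pi(a|s)].
    grad = np.zeros_like(theta)
    for states, actions, rewards in trajectories:
        G = sum(rewards)  # undiscounted return, no baseline (simplest variant)
        for s, a in zip(states, actions):
            grad += G * grad_log_policy(theta, s, a)
    return theta + lr * grad / len(trajectories)
```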
The smart grid domain requires bolstering the capabilities of existing energy management systems; Federated Learning (FL) aligns with this goal, as it demonstrates a remarkable ability to train models on heterogeneous datasets while maintaining data privacy …
External link:
http://arxiv.org/abs/2403.18439