Showing 1 - 10 of 277 for search: '"Takáč, Martin"'
Author:
Takac, Martin
This thesis consists of 5 chapters. We develop new serial (Chapter 2), parallel (Chapter 3), distributed (Chapter 4) and primal-dual (Chapter 5) stochastic (randomized) coordinate descent methods, analyze their complexity and conduct numerical experiments.
External link:
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.630403
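Since the entry above concerns stochastic (randomized) coordinate descent, a minimal serial sketch may help fix ideas. This is a generic textbook variant on a convex quadratic with made-up data, not the thesis's serial, parallel, distributed or primal-dual methods:

import numpy as np

def coordinate_descent(A, b, iters=2000, seed=0):
    # Minimal serial randomized coordinate descent for
    # f(x) = 0.5 x^T A x - b^T x with A symmetric positive definite.
    rng = np.random.default_rng(seed)
    x = np.zeros(len(b))
    L = np.diag(A)                    # per-coordinate Lipschitz constants
    for _ in range(iters):
        i = rng.integers(len(b))      # pick a coordinate uniformly at random
        g_i = A[i] @ x - b[i]         # partial derivative along coordinate i
        x[i] -= g_i / L[i]            # exact minimization along coordinate i
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(coordinate_descent(A, b))       # approaches the solution of A x = b, (0.2, 0.4)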
With the increase in the number of parameters in large language models, the process of pre-training and fine-tuning increasingly demands larger volumes of GPU memory. A significant portion of this memory is typically consumed by the optimizer state.
External link:
http://arxiv.org/abs/2411.07837
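To make the memory claim in the entry above concrete: Adam-style optimizers keep two fp32 moment buffers per parameter, so the optimizer state alone costs about 8 bytes per parameter. A back-of-envelope calculation, using a hypothetical 7B-parameter model as the example:

# Illustrative numbers only; actual footprints depend on precision and sharding.
# Adam keeps two fp32 moment buffers (m, v), i.e. 2 * 4 = 8 bytes per parameter.
params = 7e9                                  # hypothetical 7B-parameter model
state_gib = params * 2 * 4 / 1024**3
print(f"Adam optimizer state alone: {state_gib:.1f} GiB")   # ~52.2 GiB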
Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast combinatorial search space of possible graphs, which scales exponentially with the number of nodes. Recent advancements have redefined this problem…
External link:
http://arxiv.org/abs/2410.23862
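The "vast combinatorial search space" mentioned above can be quantified: the number of labeled DAGs on n nodes follows Robinson's recurrence and grows super-exponentially. A small Python check (for intuition only, unrelated to the paper's method):

from math import comb

def num_dags(n):
    # Number of labeled DAGs on n nodes (Robinson's recurrence).
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]

print([num_dags(n) for n in range(1, 7)])  # [1, 3, 25, 543, 29281, 3781503]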
Author:
Song, Kun, Solozabal, Ruben, Hao, Li, Ren, Lu, Abdar, Moloud, Li, Qing, Karray, Fakhri, Takac, Martin
Hyperbolic representation learning is well known for its ability to capture hierarchical information. However, the distance between samples from different levels of hierarchical classes can be required to be large. We reveal that the hyperbolic discriminant…
External link:
http://arxiv.org/abs/2410.22026
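For context on hyperbolic distances: the standard geodesic distance in the Poincaré ball model is $d(u,v) = \operatorname{arcosh}\bigl(1 + 2\|u-v\|^2 / ((1-\|u\|^2)(1-\|v\|^2))\bigr)$. A minimal sketch of this standard formula, not of the paper's method:

import numpy as np

def poincare_distance(u, v):
    # Geodesic distance in the Poincare ball model (standard formula).
    num = 2 * np.dot(u - v, u - v)
    den = (1 - np.dot(u, u)) * (1 - np.dot(v, v))
    return np.arccosh(1 + num / den)

origin = np.array([0.0, 0.0])
near_boundary = np.array([0.9, 0.0])
print(poincare_distance(origin, near_boundary))  # ~2.94: distances blow up near the boundary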
Second-order methods for convex optimization outperform first-order methods in terms of theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly-smooth functions. However, their practical performance and applications are limited…
External link:
http://arxiv.org/abs/2410.04083
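For contrast with the refined schemes the entry above studies, the textbook second-order baseline is a damped Newton step, x <- x - alpha * H(x)^{-1} grad(x). A generic sketch on a toy objective, not the paper's algorithm:

import numpy as np

def damped_newton(grad, hess, x0, alpha=1.0, iters=20):
    # Textbook (damped) Newton iteration.
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x -= alpha * np.linalg.solve(hess(x), grad(x))
    return x

# Minimize f(x) = x1^4 + x2^2 from the starting point (1, 1).
grad = lambda x: np.array([4 * x[0] ** 3, 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2, 0.0], [0.0, 2.0]])
print(damped_newton(grad, hess, [1.0, 1.0]))  # approaches the minimizer (0, 0)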
Non-iid data is prevalent in real-world federated learning problems. Data heterogeneity can come in different types in terms of distribution shifts. In this work, we are interested in the heterogeneity that comes from concept shifts, i.e., shifts in…
External link:
http://arxiv.org/abs/2410.03497
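A toy illustration of the concept shift mentioned above: two clients share the same feature distribution P(x) but differ in the labeling rule P(y|x). Hypothetical data, for intuition only:

import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(size=(100, 2))          # client 1 features
X2 = rng.normal(size=(100, 2))          # client 2 features: same P(x)
y1 = (X1[:, 0] > 0).astype(int)         # client 1 labels by feature 0
y2 = (X2[:, 1] > 0).astype(int)         # client 2 labels by feature 1: P(y|x) differs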
Statistical data heterogeneity is a significant barrier to convergence in federated learning (FL). While prior work has advanced heterogeneous FL through better optimization objectives, these methods fall short when there is extreme data heterogeneity…
External link:
http://arxiv.org/abs/2410.03042
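For reference, the standard FL baseline against which heterogeneity-aware objectives are usually measured is FedAvg-style weighted model averaging. A minimal generic sketch, not the method proposed in the entry above:

import numpy as np

def fedavg(client_models, client_sizes):
    # FedAvg aggregation: average client parameter vectors, weighted by data size.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_models, client_sizes))

models = [np.array([1.0, 2.0]), np.array([3.0, 0.0])]
print(fedavg(models, [100, 300]))  # -> [2.5, 0.5]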
Author:
Gorbunov, Eduard, Tupitsa, Nazarii, Choudhury, Sayantan, Aliev, Alen, Richtárik, Peter, Horváth, Samuel, Takáč, Martin
Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., 2020)…
External link:
http://arxiv.org/abs/2409.14989
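For reference, the $(L_0,L_1)$-smoothness assumption of Zhang et al. bounds the Hessian norm by an affine function of the gradient norm:
\[
\|\nabla^2 f(x)\| \le L_0 + L_1\,\|\nabla f(x)\|,
\]
which reduces to standard $L$-smoothness when $L_1 = 0$.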
Author:
Fares, Samar, Ziu, Klea, Aremu, Toluwani, Durasov, Nikita, Takáč, Martin, Fua, Pascal, Nandakumar, Karthik, Laptev, Ivan
Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in safeguarding…
External link:
http://arxiv.org/abs/2406.09250
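To illustrate the kind of attack such defenses target, here is the classic single-step FGSM perturbation, a generic attack sketch with a hypothetical toy classifier, unrelated to the defense proposed in the entry above:

import torch

def fgsm(model, x, y, eps=8 / 255):
    # Fast Gradient Sign Method: one signed-gradient step within an eps ball.
    x = x.clone().detach().requires_grad_(True)
    torch.nn.functional.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Toy usage with a hypothetical linear classifier on 28x28 inputs.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x_adv = fgsm(model, torch.rand(1, 1, 28, 28), torch.tensor([3]))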
Author:
Chezhegov, Savelii, Klyukin, Yaroslav, Semenov, Andrei, Beznosikov, Aleksandr, Gasnikov, Alexander, Horváth, Samuel, Takáč, Martin, Gorbunov, Eduard
Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. Typically, the noise in the stochastic gradients is heavy-tailed for the latter. Gradient clipping…
External link:
http://arxiv.org/abs/2406.04443
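The standard remedy for heavy-tailed gradient noise referenced in the truncated sentence above is clipping by global gradient norm. A minimal generic version, not the paper's specific scheme:

import numpy as np

def clip_by_global_norm(grad, max_norm=1.0):
    # Rescale the gradient so its Euclidean norm never exceeds max_norm.
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

print(clip_by_global_norm(np.array([3.0, 4.0])))  # norm 5 -> rescaled to [0.6, 0.8]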