Showing 1 - 10
of 1,746
for search: '"A. Takac"'
Author:
Demidovich, Yury, Ostroukhov, Petr, Malinovsky, Grigory, Horváth, Samuel, Takáč, Martin, Richtárik, Peter, Gorbunov, Eduard
Non-convex Machine Learning problems typically do not adhere to the standard smoothness assumption. Based on empirical findings, Zhang et al. (2020b) proposed a more realistic generalized $(L_0, L_1)$-smoothness assumption, though it remains largely
External link:
http://arxiv.org/abs/2412.02781
With the increase in the number of parameters in large language models, the process of pre-training and fine-tuning increasingly demands larger volumes of GPU memory. A significant portion of this memory is typically consumed by the optimizer state.
External link:
http://arxiv.org/abs/2411.07837
Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast combinatorial search space of possible graphs, which scales exponentially with the number of nodes. Recent advancements have redefined this prob
External link:
http://arxiv.org/abs/2410.23862
Author:
Song, Kun, Solozabal, Ruben, Hao, Li, Ren, Lu, Abdar, Moloud, Li, Qing, Karray, Fakhri, Takac, Martin
Hyperbolic representation learning is well known for its ability to capture hierarchical information. However, the distance between samples from different levels of hierarchical classes may need to be large. We reveal that the hyperbolic discriminan
External link:
http://arxiv.org/abs/2410.22026
Second-order methods for convex optimization outperform first-order methods in terms of theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly-smooth functions. However, their practical performance and applications are limite
External link:
http://arxiv.org/abs/2410.04083
Non-iid data is prevalent in real-world federated learning problems. Data heterogeneity can come in different types in terms of distribution shifts. In this work, we are interested in the heterogeneity that comes from concept shifts, i.e., shifts in
External link:
http://arxiv.org/abs/2410.03497
Statistical data heterogeneity is a significant barrier to convergence in federated learning (FL). While prior work has advanced heterogeneous FL through better optimization objectives, these methods fall short when there is extreme data heterogeneit
External link:
http://arxiv.org/abs/2410.03042
We characterize when an Orlicz space $L^A$ is almost compactly (uniformly absolutely continuously) embedded into a Lorentz space $L^{p,q}$ in terms of a balance condition involving parameters $p,q\in[1,\infty]$, and a Young function $A$. In the cours
External link:
http://arxiv.org/abs/2410.02495
Author:
Gorbunov, Eduard, Tupitsa, Nazarii, Choudhury, Sayantan, Aliev, Alen, Richtárik, Peter, Horváth, Samuel, Takáč, Martin
Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al.,
External link:
http://arxiv.org/abs/2409.14989
Author:
Fares, Samar, Ziu, Klea, Aremu, Toluwani, Durasov, Nikita, Takáč, Martin, Fua, Pascal, Nandakumar, Karthik, Laptev, Ivan
Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in saf
External link:
http://arxiv.org/abs/2406.09250