Showing 1 - 10 of 277 for search: '"Takáč, Martin"'
Author:
Takac, Martin
This thesis consists of 5 chapters. We develop new serial (Chapter 2), parallel (Chapter 3), distributed (Chapter 4) and primal-dual (Chapter 5) stochastic (randomized) coordinate descent methods, analyze their complexity and conduct numerical experiments.
External link:
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.630403
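Since the entry above concerns stochastic (randomized) coordinate descent, a minimal serial sketch may help fix ideas. This is a generic textbook variant on a convex quadratic with made-up data, not the thesis's serial, parallel, distributed or primal-dual methods:

import numpy as np

def coordinate_descent(A, b, iters=2000, seed=0):
    # Minimal serial randomized coordinate descent for
    # f(x) = 0.5 x^T A x - b^T x with A symmetric positive definite.
    rng = np.random.default_rng(seed)
    x = np.zeros(len(b))
    L = np.diag(A)                    # per-coordinate Lipschitz constants
    for _ in range(iters):
        i = rng.integers(len(b))      # pick a coordinate uniformly at random
        g_i = A[i] @ x - b[i]         # partial derivative along coordinate i
        x[i] -= g_i / L[i]            # exact minimization along coordinate i
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(coordinate_descent(A, b))       # approaches the solution of A x = b, (0.2, 0.4)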
With the increase in the number of parameters in large language models, the process of pre-training and fine-tuning increasingly demands larger volumes of GPU memory. A significant portion of this memory is typically consumed by the optimizer state.
External link:
http://arxiv.org/abs/2411.07837
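To make the memory claim in the entry above concrete: Adam-style optimizers keep two fp32 moment buffers per parameter, so the optimizer state alone costs about 8 bytes per parameter. A back-of-envelope calculation, using a hypothetical 7B-parameter model as the example:

# Illustrative numbers only; actual footprints depend on precision and sharding.
# Adam keeps two fp32 moment buffers (m, v), i.e. 2 * 4 = 8 bytes per parameter.
params = 7e9                                  # hypothetical 7B-parameter model
state_gib = params * 2 * 4 / 1024**3
print(f"Adam optimizer state alone: {state_gib:.1f} GiB")   # ~52.2 GiB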
Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast combinatorial search space of possible graphs, which scales exponentially with the number of nodes. Recent advancements have redefined this problem…
External link:
http://arxiv.org/abs/2410.23862
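The "vast combinatorial search space" mentioned above can be quantified: the number of labeled DAGs on n nodes follows Robinson's recurrence and grows super-exponentially. A small Python check (for intuition only, unrelated to the paper's method):

from math import comb

def num_dags(n):
    # Number of labeled DAGs on n nodes (Robinson's recurrence).
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]

print([num_dags(n) for n in range(1, 7)])  # [1, 3, 25, 543, 29281, 3781503]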
Author:
Song, Kun, Solozabal, Ruben, Hao, Li, Ren, Lu, Abdar, Moloud, Li, Qing, Karray, Fakhri, Takac, Martin
Hyperbolic representation learning is well known for its ability to capture hierarchical information. However, the distance between samples from different levels of hierarchical classes can be required to be large. We reveal that the hyperbolic discriminant…
External link:
http://arxiv.org/abs/2410.22026
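For context on hyperbolic distances: the standard geodesic distance in the Poincaré ball model is $d(u,v) = \operatorname{arcosh}\bigl(1 + 2\|u-v\|^2 / ((1-\|u\|^2)(1-\|v\|^2))\bigr)$. A minimal sketch of this standard formula, not of the paper's method:

import numpy as np

def poincare_distance(u, v):
    # Geodesic distance in the Poincare ball model (standard formula).
    num = 2 * np.dot(u - v, u - v)
    den = (1 - np.dot(u, u)) * (1 - np.dot(v, v))
    return np.arccosh(1 + num / den)

origin = np.array([0.0, 0.0])
near_boundary = np.array([0.9, 0.0])
print(poincare_distance(origin, near_boundary))  # ~2.94: distances blow up near the boundary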
Second-order methods for convex optimization outperform first-order methods in terms of theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly-smooth functions. However, their practical performance and applications are limited…
External link:
http://arxiv.org/abs/2410.04083
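For contrast with the refined schemes the entry above studies, the textbook second-order baseline is a damped Newton step, x <- x - alpha * H(x)^{-1} grad(x). A generic sketch on a toy objective, not the paper's algorithm:

import numpy as np

def damped_newton(grad, hess, x0, alpha=1.0, iters=20):
    # Textbook (damped) Newton iteration.
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x -= alpha * np.linalg.solve(hess(x), grad(x))
    return x

# Minimize f(x) = x1^4 + x2^2 from the starting point (1, 1).
grad = lambda x: np.array([4 * x[0] ** 3, 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2, 0.0], [0.0, 2.0]])
print(damped_newton(grad, hess, [1.0, 1.0]))  # approaches the minimizer (0, 0)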
Non-iid data is prevalent in real-world federated learning problems. Data heterogeneity can come in different types in terms of distribution shifts. In this work, we are interested in the heterogeneity that comes from concept shifts, i.e., shifts in…
External link:
http://arxiv.org/abs/2410.03497
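A toy illustration of the concept shift mentioned above: two clients share the same feature distribution P(x) but differ in the labeling rule P(y|x). Hypothetical data, for intuition only:

import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(size=(100, 2))          # client 1 features
X2 = rng.normal(size=(100, 2))          # client 2 features: same P(x)
y1 = (X1[:, 0] > 0).astype(int)         # client 1 labels by feature 0
y2 = (X2[:, 1] > 0).astype(int)         # client 2 labels by feature 1: P(y|x) differs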
Statistical data heterogeneity is a significant barrier to convergence in federated learning (FL). While prior work has advanced heterogeneous FL through better optimization objectives, these methods fall short when there is extreme data heterogeneity…
External link:
http://arxiv.org/abs/2410.03042
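For reference, the standard FL baseline against which heterogeneity-aware objectives are usually measured is FedAvg-style weighted model averaging. A minimal generic sketch, not the method proposed in the entry above:

import numpy as np

def fedavg(client_models, client_sizes):
    # FedAvg aggregation: average client parameter vectors, weighted by data size.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_models, client_sizes))

models = [np.array([1.0, 2.0]), np.array([3.0, 0.0])]
print(fedavg(models, [100, 300]))  # -> [2.5, 0.5]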
Author:
Gorbunov, Eduard, Tupitsa, Nazarii, Choudhury, Sayantan, Aliev, Alen, Richtárik, Peter, Horváth, Samuel, Takáč, Martin
Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., 2020)…
External link:
http://arxiv.org/abs/2409.14989
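For reference, the $(L_0,L_1)$-smoothness assumption of Zhang et al. bounds the Hessian norm by an affine function of the gradient norm:
\[
\|\nabla^2 f(x)\| \le L_0 + L_1\,\|\nabla f(x)\|,
\]
which reduces to standard $L$-smoothness when $L_1 = 0$.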
Author:
Fares, Samar, Ziu, Klea, Aremu, Toluwani, Durasov, Nikita, Takáč, Martin, Fua, Pascal, Nandakumar, Karthik, Laptev, Ivan
Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in safeguarding…
External link:
http://arxiv.org/abs/2406.09250
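To illustrate the kind of attack such defenses target, here is the classic single-step FGSM perturbation, a generic attack sketch with a hypothetical toy classifier, unrelated to the defense proposed in the entry above:

import torch

def fgsm(model, x, y, eps=8 / 255):
    # Fast Gradient Sign Method: one signed-gradient step within an eps ball.
    x = x.clone().detach().requires_grad_(True)
    torch.nn.functional.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Toy usage with a hypothetical linear classifier on 28x28 inputs.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x_adv = fgsm(model, torch.rand(1, 1, 28, 28), torch.tensor([3]))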
Author:
Chezhegov, Savelii, Klyukin, Yaroslav, Semenov, Andrei, Beznosikov, Aleksandr, Gasnikov, Alexander, Horváth, Samuel, Takáč, Martin, Gorbunov, Eduard
Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. Typically, the noise in the stochastic gradients is heavy-tailed for the latter. Gradient clipping…
External link:
http://arxiv.org/abs/2406.04443
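The standard remedy for heavy-tailed gradient noise referenced in the truncated sentence above is clipping by global gradient norm. A minimal generic version, not the paper's specific scheme:

import numpy as np

def clip_by_global_norm(grad, max_norm=1.0):
    # Rescale the gradient so its Euclidean norm never exceeds max_norm.
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

print(clip_by_global_norm(np.array([3.0, 4.0])))  # norm 5 -> rescaled to [0.6, 0.8]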