Showing 1 - 10 of 74 for search: '"Gurbuzbalaban, Mert"'
Author:
Gurbuzbalaban, Mert
We consider the problem of minimizing a strongly convex smooth function where the gradients are subject to additive worst-case deterministic errors that are square-summable. We study the trade-offs between the convergence rate and robustness to gradient errors …
External link:
http://arxiv.org/abs/2309.11481
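The entry above studies gradient descent when each gradient is corrupted by an additive, square-summable deterministic error. A minimal sketch of such an inexact gradient method on a strongly convex quadratic is given below; the quadratic objective, step size, and the 1/(k+1) error schedule are illustrative assumptions, not the worst-case construction analyzed in the paper.

```python
import numpy as np

# Minimal sketch: gradient descent with additive, square-summable gradient errors
# on a strongly convex quadratic f(x) = 0.5 * x^T A x - b^T x.
# The error schedule ||e_k|| = 1/(k+1) is an illustrative assumption.
rng = np.random.default_rng(0)
d = 5
A = np.diag(np.linspace(1.0, 10.0, d))   # strongly convex, L-smooth quadratic
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)

x = np.zeros(d)
step = 0.1                                # 1/L, where L = 10 is the largest eigenvalue of A
for k in range(200):
    error = rng.standard_normal(d)
    error *= 1.0 / ((k + 1) * np.linalg.norm(error))   # ||e_k|| = 1/(k+1), square-summable
    grad = A @ x - b + error              # inexact gradient
    x -= step * grad

print("distance to optimum:", np.linalg.norm(x - x_star))
```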
This paper considers the problem of understanding the behavior of a general class of accelerated gradient methods on smooth nonconvex functions. Motivated by some recent works that have proposed effective algorithms, based on Polyak's heavy ball method …
External link:
http://arxiv.org/abs/2307.07030
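For reference, Polyak's heavy ball iteration mentioned in the entry above takes the form $x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta (x_k - x_{k-1})$. The sketch below applies it to a simple smooth nonconvex test function; the function, step size, and momentum value are illustrative choices, not those analyzed in the paper.

```python
import numpy as np

# Heavy ball iteration x_{k+1} = x_k - alpha * grad_f(x_k) + beta * (x_k - x_{k-1})
# on a simple smooth nonconvex function; alpha and beta are illustrative choices.
def f(x):
    return 0.25 * np.sum(x**4) - 0.5 * np.sum(x**2)   # nonconvex, minima at x_i = +/-1

def grad_f(x):
    return x**3 - x

alpha, beta = 0.05, 0.9
x_prev = x = np.array([2.0, -1.5, 0.3])
for _ in range(500):
    x, x_prev = x - alpha * grad_f(x) + beta * (x - x_prev), x

print("final point:", x, "f(x) =", f(x))
```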
Algorithmic stability is an important notion that has proven powerful for deriving generalization bounds for practical algorithms. The last decade has witnessed an increasing number of stability bounds for different algorithms applied on different classes of loss functions …
External link:
http://arxiv.org/abs/2305.12056
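Algorithmic stability, referenced in the entry above, is usually measured by how much an algorithm's output changes when a single training point is replaced. A rough empirical sketch of that comparison for SGD on least squares follows; the model, the data, and the single-point swap are assumptions made purely for illustration and are not the analysis in the paper.

```python
import numpy as np

# Rough empirical proxy for algorithmic stability: run the same SGD sample path
# on two datasets that differ in one example and compare the outputs.
# Linear least squares and the specific perturbation are illustrative assumptions.
rng = np.random.default_rng(1)
n, d = 100, 5
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.standard_normal(d), rng.standard_normal()   # neighboring dataset

def sgd(X, y, steps=2000, lr=0.01, seed=2):
    rng = np.random.default_rng(seed)      # identical sample path on both datasets
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))
        # gradient of the single-example loss 0.5 * (x_i^T w - y_i)^2
        w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

w1, w2 = sgd(X, y), sgd(X2, y2)
print("parameter-level stability proxy ||w1 - w2||:", np.linalg.norm(w1 - w2))
```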
Recent theoretical studies have shown that heavy-tails can emerge in stochastic optimization due to 'multiplicative noise', even under surprisingly simple settings, such as linear regression with Gaussian data. While these studies have uncovered several …
External link:
http://arxiv.org/abs/2205.06689
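The multiplicative-noise mechanism mentioned above can be reproduced in a few lines: SGD on Gaussian linear regression has iterates of the form $w_{k+1} = (I - \eta x_k x_k^\top) w_k + \eta y_k x_k$, a random linear recursion whose stationary distribution can be heavy-tailed when the step size is large. The simulation below is only an illustrative sketch; the step size, dimension, and pure-noise targets are arbitrary choices.

```python
import numpy as np

# SGD on linear regression viewed as a random linear recursion:
# w_{k+1} = (I - eta * x_k x_k^T) w_k + eta * y_k * x_k.
# With Gaussian data and a large step size eta, the stationary distribution of
# w_k can be heavy-tailed; eta below is an illustrative choice.
rng = np.random.default_rng(0)
d, eta, burn_in, samples = 2, 0.6, 1000, 20000

w = np.zeros(d)
norms = []
for k in range(burn_in + samples):
    x = rng.standard_normal(d)
    y = rng.standard_normal()          # pure-noise targets keep the example short
    w = w - eta * (x @ w - y) * x      # one SGD step on 0.5 * (x^T w - y)^2
    if k >= burn_in:
        norms.append(np.linalg.norm(w))

norms = np.array(norms)
print("median ||w||:", np.median(norms), " 99.9th percentile:", np.quantile(norms, 0.999))
```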
In this work, we consider strongly convex strongly concave (SCSC) saddle point (SP) problems $\min_{x\in\mathbb{R}^{d_x}}\max_{y\in\mathbb{R}^{d_y}}f(x,y)$ where $f$ is $L$-smooth, $f(\cdot,y)$ is $\mu$-strongly convex for every $y$, and $f(x,\cdot)$ is $\mu$-strongly concave for every $x$ …
External link:
http://arxiv.org/abs/2202.09688
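A standard baseline for saddle point problems of the kind described above is the gradient descent-ascent iteration $x_{k+1} = x_k - \eta \nabla_x f(x_k, y_k)$, $y_{k+1} = y_k + \eta \nabla_y f(x_k, y_k)$. The sketch below runs it on a small strongly-convex-strongly-concave quadratic; the bilinear coupling and the step size are illustrative assumptions, not the algorithm studied in the paper.

```python
import numpy as np

# Gradient descent-ascent on a strongly-convex-strongly-concave quadratic
# f(x, y) = (mu/2)||x||^2 + x^T B y - (mu/2)||y||^2, whose saddle point is (0, 0).
# This baseline and its parameters are illustrative, not the paper's method.
rng = np.random.default_rng(0)
dx, dy, mu, eta = 3, 2, 1.0, 0.05
B = rng.standard_normal((dx, dy))

x, y = rng.standard_normal(dx), rng.standard_normal(dy)
for _ in range(2000):
    gx = mu * x + B @ y                  # grad_x f
    gy = B.T @ x - mu * y                # grad_y f
    x, y = x - eta * gx, y + eta * gy    # simultaneous descent-ascent step

print("||x||, ||y|| at the end:", np.linalg.norm(x), np.linalg.norm(y))
```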
Gradient-related first-order methods have become the workhorse of large-scale numerical optimization problems. Many of these problems involve nonconvex objective functions with multiple saddle points, which necessitates an understanding of the behavior …
External link:
http://arxiv.org/abs/2101.02625
Published in:
SIAM Journal on Optimization 2022 32:2, 795-821
We present two classes of differentially private optimization algorithms derived from the well-known accelerated first-order methods. The first algorithm is inspired by Polyak's heavy ball method and employs a smoothing approach to decrease the accumulated noise …
External link:
http://arxiv.org/abs/2008.01989
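The differentially private algorithms summarized above build on accelerated first-order methods that must cope with noise injected into the gradients. A generic sketch of a noisy heavy-ball iteration is shown below for illustration only; the Gaussian noise level, clipping threshold, step size, and momentum are placeholder choices and not the calibrated privacy mechanism or the smoothing scheme from the paper.

```python
import numpy as np

# Generic noisy heavy-ball sketch: clip the gradient and add Gaussian noise
# before the momentum update. sigma, clip, alpha, and beta are placeholder
# values, NOT calibrated differential-privacy parameters.
rng = np.random.default_rng(0)

def grad_f(x):                     # simple strongly convex quadratic for illustration
    return 2.0 * x

alpha, beta, clip, sigma = 0.1, 0.7, 1.0, 0.1
x_prev = x = np.array([3.0, -2.0])
for _ in range(300):
    g = grad_f(x)
    g = g * min(1.0, clip / np.linalg.norm(g))          # gradient clipping
    g = g + sigma * rng.standard_normal(x.shape)        # Gaussian noise injection
    x, x_prev = x - alpha * g + beta * (x - x_prev), x  # heavy-ball update

print("final iterate:", x)
```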
Published in:
Published as a conference paper at International Conference on Machine Learning (ICML) 2021
In recent years, various notions of capacity and complexity have been proposed for characterizing the generalization properties of stochastic gradient descent (SGD) in deep learning. Some of the popular notions that correlate well with the performance …
External link:
http://arxiv.org/abs/2006.04740
Published in:
Information and Inference: A Journal of the IMA, vol. 12, no. 2, pp. 714-786, Jun. 2023
This paper considers the problem of understanding the exit time for trajectories of gradient-related first-order methods from saddle neighborhoods under some initial boundary conditions. Given the 'flat' geometry around saddle points, first-order methods …
External link:
http://arxiv.org/abs/2006.01106
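To make the saddle-neighborhood setting above concrete, the sketch below runs plain gradient descent near the saddle of $f(x_1, x_2) = \tfrac{1}{2}(x_1^2 - x_2^2)$ and records how many iterations it takes to leave a ball around the saddle; the test function, radius, step size, and initializations are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Illustrative exit-time experiment: gradient descent started near the saddle of
# f(x1, x2) = 0.5 * (x1^2 - x2^2), counting iterations until leaving a ball of
# radius r around the saddle at the origin. All constants are illustrative.
def grad_f(x):
    return np.array([x[0], -x[1]])     # gradient of 0.5 * (x1^2 - x2^2)

eta, r = 0.1, 1.0
for eps in [1e-2, 1e-4, 1e-6]:         # size of the initial unstable component
    x = np.array([0.5, eps])           # mostly along the stable direction
    k = 0
    while np.linalg.norm(x) < r and k < 100000:
        x = x - eta * grad_f(x)
        k += 1
    print(f"initial unstable component {eps:.0e}: exit after {k} iterations")
```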
Authors:
Gurbuzbalaban, Mert; Hu, Yuanhan
A traditional approach to initialization in deep neural networks (DNNs) is to sample the network weights randomly for preserving the variance of pre-activations. On the other hand, several studies show that during the training process, the distribution …
External link:
http://arxiv.org/abs/2005.11878
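The variance-preserving initialization mentioned above can be illustrated in a few lines: drawing weights with variance scaled by the fan-in keeps the pre-activation variance roughly constant across layers, while an unscaled draw makes it grow with depth. The sketch below compares the two; the network width, depth, and ReLU activation are illustrative choices and are not the initialization scheme proposed in the paper.

```python
import numpy as np

# Compare a fan-in-scaled Gaussian initialization with an unscaled one by
# tracking the variance of pre-activations through a deep ReLU stack.
# Width, depth, and the ReLU nonlinearity are illustrative choices.
rng = np.random.default_rng(0)
width, depth, batch = 256, 20, 128

def forward_variance(weight_std):
    h = rng.standard_normal((batch, width))
    for _ in range(depth):
        W = weight_std * rng.standard_normal((width, width))
        z = h @ W                      # pre-activations
        h = np.maximum(z, 0.0)         # ReLU
    return z.var()                     # variance at the last layer

print("fan-in scaled std sqrt(2/width):", forward_variance(np.sqrt(2.0 / width)))
print("unscaled std 1.0:               ", forward_variance(1.0))
```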