Showing 1 - 10 of 21
for the search: '"Can, Bugra"'
Author:
Can, Bugra, Gürbüzbalaban, Mert
In the context of first-order algorithms subject to random gradient noise, we study the trade-offs between the convergence rate (which quantifies how fast the initial conditions are forgotten) and the "risk" of suboptimality, i.e. deviations from the …
External link:
http://arxiv.org/abs/2204.11292
In this work, we consider strongly convex strongly concave (SCSC) saddle point (SP) problems $\min_{x\in\mathbb{R}^{d_x}}\max_{y\in\mathbb{R}^{d_y}}f(x,y)$ where $f$ is $L$-smooth, $f(.,y)$ is $\mu$-strongly convex for every $y$, and $f(x,.)$ is $\mu$-strongly concave …
External link:
http://arxiv.org/abs/2202.09688
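The SCSC saddle-point setting above can be illustrated with plain simultaneous gradient descent-ascent, a baseline method rather than the paper's algorithm; the step size and the toy quadratic objective below are illustrative assumptions:

```python
import numpy as np

def gda(grad_x, grad_y, x, y, eta, iters):
    """Simultaneous gradient descent-ascent on min_x max_y f(x, y):
    descend along grad_x, ascend along grad_y."""
    for _ in range(iters):
        x, y = x - eta * grad_x(x, y), y + eta * grad_y(x, y)
    return x, y

# Toy SCSC example: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2 is 1-strongly
# convex in x and 1-strongly concave in y; its unique saddle point is (0, 0).
x, y = gda(lambda x, y: x + y,   # df/dx
           lambda x, y: x - y,   # df/dy
           1.0, 1.0, eta=0.2, iters=300)
```

For this bilinear-coupled quadratic the iteration matrix has spectral radius below one, so the iterates contract to the saddle point $(0, 0)$.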
This work proposes a distributed algorithm for solving empirical risk minimization problems, called L-DQN, under the master/worker communication model. L-DQN is a distributed limited-memory quasi-Newton method that supports asynchronous computations …
External link:
http://arxiv.org/abs/2108.09365
This work proposes a time-efficient Natural Gradient Descent method, called TENGraD, with linear convergence guarantees. Computing the inverse of the neural network's Fisher information matrix is expensive in NGD because the Fisher matrix is large. …
External link:
http://arxiv.org/abs/2106.03947
Author:
Arjevani, Yossi, Bruna, Joan, Can, Bugra, Gürbüzbalaban, Mert, Jegelka, Stefanie, Lin, Hongzhou
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex. Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated …
External link:
http://arxiv.org/abs/2006.06733
Author:
Can, Bugra, Caglar, Mine
Sticky Brownian motion on the real line can be obtained as a weak solution of a system of stochastic differential equations. We find the conditional distribution of the process given the driving Brownian motion, both at an independent exponential time …
External link:
http://arxiv.org/abs/1910.10213
The effective resistance between a pair of nodes in a weighted undirected graph is defined as the potential difference induced when a unit current is injected at one node and extracted from the other, treating edge weights as the conductance values of …
External link:
http://arxiv.org/abs/1907.13110
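The definition above has a standard computational form: the effective resistance between nodes $u$ and $v$ equals $(e_u - e_v)^\top L^{+} (e_u - e_v)$, where $L^{+}$ is the Moore-Penrose pseudoinverse of the graph Laplacian. A minimal sketch (not code from the paper):

```python
import numpy as np

def effective_resistance(weights, u, v):
    """Effective resistance between nodes u and v of a weighted
    undirected graph; `weights` is a symmetric (n, n) array of
    edge conductances (0 where no edge)."""
    L = np.diag(weights.sum(axis=1)) - weights  # graph Laplacian
    L_pinv = np.linalg.pinv(L)                  # pseudoinverse handles the null space
    e = np.zeros(len(weights))
    e[u], e[v] = 1.0, -1.0                      # unit current in at u, out at v
    return float(e @ L_pinv @ e)                # induced potential difference

# Path graph 0 - 1 - 2 with unit conductances: resistances add in series.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
r = effective_resistance(W, 0, 2)
print(r)  # ≈ 2.0, since two unit resistors in series give resistance 2
```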
ASYNC is a framework that supports the implementation of asynchrony and history for optimization methods on distributed computing platforms. The popularity of asynchronous optimization methods has increased in distributed machine learning. However, …
External link:
http://arxiv.org/abs/1907.08526
Published in:
International Conference on Machine Learning 2019, 891-901
Momentum methods such as Polyak's heavy ball (HB) method, Nesterov's accelerated gradient (AG) as well as accelerated projected gradient (APG) method have been commonly used in machine learning practice, but their performance is quite sensitive to noise …
External link:
http://arxiv.org/abs/1901.07445
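For reference, the classical heavy-ball update that this line of work builds on is $x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta (x_k - x_{k-1})$. The sketch below shows only this baseline iteration on a toy quadratic, not the paper's noise-robust variant; the step-size and momentum values are illustrative assumptions:

```python
import numpy as np

def heavy_ball(grad, x0, alpha, beta, iters):
    """Polyak's heavy-ball iteration:
        x_{k+1} = x_k - alpha * grad(x_k) + beta * (x_k - x_{k-1})
    """
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = x - alpha * grad(x) + beta * (x - x_prev), x
    return x

# Strongly convex quadratic f(x) = 0.5 * x^T A x, minimized at the origin.
A = np.diag([1.0, 10.0])
x = heavy_ball(lambda x: A @ x, np.array([1.0, 1.0]),
               alpha=0.1, beta=0.5, iters=200)
```

With these parameters both eigen-directions of $A$ contract, so the iterates converge to the minimizer at the origin.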
Academic article
This result cannot be displayed to users who are not logged in.
You must log in to view the result.