Showing 1 - 10 of 168 for the search: '"Drusvyatskiy, Dmitriy"'
A prevalent belief among optimization specialists is that linear convergence of gradient descent is contingent on the function growing quadratically away from its minimizers. In this work, we argue that this belief is inaccurate. We show that gradient…
External link:
http://arxiv.org/abs/2409.19791
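As a side note on the entry above: a minimal gradient-descent loop like the following is one way to observe a linear rate empirically, by checking that the log of the error decays along a straight line. This is a generic sketch, not the construction from the paper; the quadratic objective, step size, and conditioning are illustrative assumptions.

import numpy as np

def gradient_descent(grad, x0, step, iters):
    # Plain gradient descent; returns the full iterate history.
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(iters):
        xs.append(xs[-1] - step * grad(xs[-1]))
    return xs

# Toy smooth convex objective f(x) = 0.5 * x^T A x (illustrative, ill-conditioned A).
A = np.diag([1.0, 0.01])
grad = lambda x: A @ x

history = gradient_descent(grad, x0=[1.0, 1.0], step=1.0, iters=200)
errors = [np.linalg.norm(x) for x in history]
# Linear convergence shows up as log(error) decreasing along a straight line.
rates = np.diff(np.log(errors))
print("empirical contraction factor ~", np.exp(np.mean(rates[-50:])))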
Classical results in asymptotic statistics show that the Fisher information matrix controls the difficulty of estimating a statistical model from observed data. In this work, we introduce a companion measure of robustness of an estimation problem: the…
External link:
http://arxiv.org/abs/2405.09676
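For context on the entry above, the classical quantity it refers to is the Fisher information matrix, and the Cramér-Rao bound is the standard statement (under the usual regularity conditions) that it controls estimation difficulty. The companion robustness measure introduced in the paper is not reproduced here, since the snippet is truncated.

% Fisher information of a parametric model p_theta and the Cramer-Rao bound
I(\theta) \;=\; \mathbb{E}_{x \sim p_\theta}\!\left[ \nabla_\theta \log p_\theta(x)\, \nabla_\theta \log p_\theta(x)^{\top} \right],
\qquad
\mathrm{Cov}\big(\hat\theta_n\big) \;\succeq\; \tfrac{1}{n}\, I(\theta)^{-1}
\quad \text{for unbiased } \hat\theta_n \text{ built from } n \text{ i.i.d. samples.}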
A fundamental problem in machine learning is to understand how neural networks make accurate predictions while seemingly bypassing the curse of dimensionality. A possible explanation is that common training algorithms for neural networks implicitly…
External link:
http://arxiv.org/abs/2401.04553
Modern machine learning paradigms, such as deep learning, occur in or close to the interpolation regime, wherein the number of model parameters is much larger than the number of data samples. In this work, we propose a regularity condition within the…
External link:
http://arxiv.org/abs/2306.02601
We show that the deviation between the slopes of two convex functions controls the deviation between the functions themselves. This result reveals that the slope -- a one-dimensional construct -- robustly determines convex functions, up to a constant…
External link:
http://arxiv.org/abs/2303.16277
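For the entry above, a reminder of the object involved: the slope of a function f at a point x is its local rate of steepest descent, and for convex f it coincides with the distance of the origin to the subdifferential. The lines below record only these standard definitions; the precise form of the paper's deviation bound is not reproduced here, since the snippet is truncated.

% Slope (steepest-descent rate) of f at x, and its expression for convex f
|\nabla f|(x) \;=\; \limsup_{y \to x,\; y \neq x} \frac{\big(f(x) - f(y)\big)_{+}}{\|y - x\|},
\qquad
|\nabla f|(x) \;=\; \operatorname{dist}\!\big(0, \partial f(x)\big) \quad \text{for convex } f.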
In their seminal work, Polyak and Juditsky showed that stochastic approximation algorithms for solving smooth equations enjoy a central limit theorem. Moreover, it has since been argued that the asymptotic covariance of the method is best possible among…
External link:
http://arxiv.org/abs/2301.06632
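The entry above refers to stochastic approximation with iterate averaging (Polyak-Ruppert/Juditsky averaging). A minimal sketch of that scheme on a toy smooth equation follows; the linear map, noise level, and step-size schedule are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Smooth equation G(x) = A x - b = 0, observed only through noisy evaluations.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([1.0, -0.5])
x_star = np.linalg.solve(A, b)

def noisy_G(x):
    return A @ x - b + 0.1 * rng.standard_normal(2)

x = np.zeros(2)
running_avg = np.zeros(2)
n_iters = 100_000
for k in range(1, n_iters + 1):
    step = 1.0 / k**0.7                    # slowly decaying step, as in averaging schemes
    x = x - step * noisy_G(x)
    running_avg += (x - running_avg) / k   # Polyak-Ruppert iterate average

print("last iterate error :", np.linalg.norm(x - x_star))
print("averaged error     :", np.linalg.norm(running_avg - x_star))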
Author:
Davis, Damek (AUTHOR); Drusvyatskiy, Dmitriy (AUTHOR), ddrusv@uw.edu; Charisopoulos, Vasileios (AUTHOR)
Published in:
Mathematical Programming, Vol. 207, Issue 1/2 (Sep 2024), pp. 145-190.
Published in:
Journal of Machine Learning Research, 25(90):1-49, 2024
We analyze a stochastic approximation algorithm for decision-dependent problems, wherein the data distribution used by the algorithm evolves along the iterate sequence. The primary examples of such problems appear in performative prediction and its m…
External link:
http://arxiv.org/abs/2207.04173
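The entry above concerns stochastic approximation in which the sampling distribution depends on the current iterate. Below is a toy sketch in the spirit of performative prediction, assuming a simple location-family distribution map and squared loss (both are illustrative choices, not the paper's setting); the iterates settle at a performatively stable point.

import numpy as np

rng = np.random.default_rng(1)

# Decision-dependent data: the sample distribution shifts with the decision x.
mu0, eps, sigma = 1.0, 0.3, 0.5

def sample(x):
    return rng.normal(mu0 + eps * x, sigma)

def grad_loss(x, z):        # squared loss l(x, z) = 0.5 * (x - z)^2
    return x - z

x = 0.0
for t in range(1, 20_000 + 1):
    z = sample(x)           # the distribution evolves along the iterate sequence
    x -= (1.0 / t**0.6) * grad_loss(x, z)

# A performatively stable point solves x = mu0 + eps * x.
print("iterate:", x, " stable point:", mu0 / (1 - eps))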
This paper studies the problem of expected loss minimization given a data distribution that is dependent on the decision-maker's action and evolves dynamically in time according to a geometric decay process. Novel algorithms for both the information…
External link:
http://arxiv.org/abs/2204.08281
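The entry above is truncated, so the following is only one plausible toy reading of "geometric decay," stated as an assumption: the distribution's state relaxes geometrically toward a decision-dependent target while the learner takes stochastic gradient steps. All model parameters below are illustrative, not the paper's.

import numpy as np

rng = np.random.default_rng(2)

# Assumed toy dynamics: the mean mu_t relaxes geometrically toward a
# decision-dependent target mu0 + eps * x_t, while the learner updates x.
mu0, eps, sigma, delta = 1.0, 0.3, 0.5, 0.05

x, mu = 0.0, 0.0
for t in range(1, 20_000 + 1):
    mu = (1 - delta) * mu + delta * (mu0 + eps * x)   # geometric-decay drift
    z = rng.normal(mu, sigma)
    x -= (1.0 / t**0.6) * (x - z)                     # squared-loss gradient step

print("iterate:", x, " long-run stable point:", mu0 / (1 - eps))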
Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima -- those around which the loss grows…
External link:
http://arxiv.org/abs/2203.03756
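The entry above discusses how fast the loss grows around a minimizer (flat versus sharp minima). One crude, generic way to probe this numerically is to average the loss increase over small random parameter perturbations; the sketch below does that on two toy losses. This is a hypothetical probe for illustration, not the measure studied in the paper.

import numpy as np

rng = np.random.default_rng(3)

def sharpness(loss, w_min, radius=1e-2, n_probes=100):
    # Average loss increase under random perturbations of norm `radius`
    # around a candidate minimizer -- a crude probe of local loss growth.
    base = loss(w_min)
    incs = []
    for _ in range(n_probes):
        d = rng.standard_normal(w_min.shape)
        d *= radius / np.linalg.norm(d)
        incs.append(loss(w_min + d) - base)
    return float(np.mean(incs))

# Two toy losses with the same minimizer but different local growth.
flat  = lambda w: 0.5 * np.sum(w**2)    # slow growth around the minimizer
sharp = lambda w: 50.0 * np.sum(w**2)   # fast growth around the minimizer
w0 = np.zeros(5)
print("flat :", sharpness(flat, w0))
print("sharp:", sharpness(sharp, w0))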