Accelerated Gradient Descent via Long Steps

Author: Grimmer, Benjamin; Shu, Kevin; Wang, Alex L.
Publication Year: 2023
Subject:
Document Type: Working Paper
Description: Recently, Grimmer [1] showed that, for smooth convex optimization, by periodically utilizing longer steps, gradient descent's textbook $LD^2/(2T)$ convergence guarantee can be improved by constant factors, and conjectured that an accelerated rate strictly faster than $O(1/T)$ could be possible. Here we prove such a big-O gain, establishing gradient descent's first accelerated convergence rate in this setting. Namely, we prove an $O(1/T^{1.0564})$ rate for smooth convex minimization by utilizing a nonconstant, nonperiodic sequence of increasingly large stepsizes. It remains open whether one can achieve the $O(1/T^{1.178})$ rate conjectured by Das Gupta et al. [2] or the optimal gradient method rate of $O(1/T^2)$. Big-O convergence rate accelerations from long steps also follow from our theory for strongly convex optimization, similar to but somewhat weaker than those concurrently developed by Altschuler and Parrilo [3].
Database: arXiv
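
The description above concerns gradient descent run with a nonconstant sequence of stepsizes, some of which exceed the textbook $1/L$ choice. Below is a minimal Python sketch of that iteration template, $x_{k+1} = x_k - (h_k/L)\,\nabla f(x_k)$. The function name `gradient_descent_long_steps`, the quadratic test problem, and in particular the placeholder stepsize pattern are illustrative assumptions; the actual accelerating sequence constructed in the paper is not reproduced here.

```python
import numpy as np

def gradient_descent_long_steps(grad_f, x0, L, stepsizes):
    """Run x_{k+1} = x_k - (h_k / L) * grad_f(x_k) for each h_k in `stepsizes`.

    `L` is the smoothness (Lipschitz-gradient) constant of f; `stepsizes`
    is an arbitrary, possibly nonconstant and nonperiodic schedule.
    """
    x = np.asarray(x0, dtype=float)
    for h in stepsizes:
        x = x - (h / L) * grad_f(x)
    return x

# Example: minimize the smooth convex quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 0.25])
L = 1.0                      # smoothness constant = largest eigenvalue of A
grad_f = lambda x: A @ x

# Hypothetical schedule for illustration only: mostly unit steps with an
# occasional longer step. This is NOT the sequence from the paper.
T = 32
stepsizes = [3.0 if (k + 1) % 8 == 0 else 1.0 for k in range(T)]

x_final = gradient_descent_long_steps(grad_f, np.array([5.0, -3.0]), L, stepsizes)
print(x_final)  # approaches the minimizer at the origin
```

The point of the sketch is only the interface: the convergence results in the paper come from choosing the schedule `stepsizes` carefully, with increasingly large entries, rather than from the fixed $1/L$ stepsize of the standard analysis.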