Gaussian smoothing gradient descent for minimizing functions (GSmoothGD)
Author: | Starnes, Andrew; Dereventsov, Anton; Webster, Clayton |
---|---|
Year of publication: | 2023 |
Subject: | |
Document type: | Working Paper |
Description: | This work analyzes the convergence of a class of smoothing-based gradient descent methods when applied to optimization problems. In particular, Gaussian smoothing is employed to define a nonlocal gradient that reduces high-frequency noise, small variations, and rapid fluctuations in the computation of the descent directions while preserving the structure and features of the loss landscape. The resulting Gaussian smoothing gradient descent (GSmoothGD) approach can facilitate gradient descent in navigating away from and avoiding local minima with increased ease, thereby substantially enhancing its overall performance even when applied to non-convex optimization problems. This work also provides rigorous theoretical error estimates on the rate of convergence of GSmoothGD iterates. These estimates exemplify the impact of underlying function convexity, smoothness, input dimension, and the Gaussian smoothing radius. To combat the curse of dimensionality, we numerically approximate the GSmoothGD nonlocal gradient using Monte Carlo (MC) sampling and provide a theory in which the iterates converge regardless of the function smoothness and dimension. Finally, we present several strategies to update the smoothing parameter aimed at diminishing the impact of local minima, thereby rendering the attainment of global minima more achievable. Computational evidence complements the present theory and shows the effectiveness of the MC-GSmoothGD method compared to other smoothing-based algorithms, momentum-based approaches, and classical gradient-based algorithms from numerical optimization. Comment: 29 pages, 2 figures, 2 tables |
Database: | arXiv |
External link: |
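
Illustrative note (not part of the catalog record): the description mentions a Monte Carlo approximation of the Gaussian-smoothed gradient together with a schedule for the smoothing parameter. The sketch below shows one common way such a scheme can be written, assuming the standard smoothing f_sigma(x) = E[f(x + sigma*u)] with u ~ N(0, I) and an antithetic (central-difference) estimator. The function names `mc_smoothed_grad` and `mc_gsmooth_gd`, the geometric decay of sigma, and all default values are hypothetical choices made for illustration; the paper's own normalization, estimator, and update strategies may differ.

```python
import numpy as np


def mc_smoothed_grad(f, x, sigma, num_samples, rng):
    """Monte Carlo estimate of the gradient of f_sigma(x) = E[f(x + sigma*u)].

    Assumption: u ~ N(0, I) and antithetic pairs (x + sigma*u, x - sigma*u)
    are used to reduce variance. This is a generic estimator, not
    necessarily the one analyzed in the paper.
    """
    u = rng.standard_normal((num_samples, x.size))
    f_plus = np.array([f(x + sigma * ui) for ui in u])
    f_minus = np.array([f(x - sigma * ui) for ui in u])
    # Average of (f(x + sigma*u) - f(x - sigma*u)) / (2*sigma) * u over samples.
    return ((f_plus - f_minus)[:, None] * u).mean(axis=0) / (2.0 * sigma)


def mc_gsmooth_gd(f, x0, sigma0=1.0, decay=0.99, lr=0.1,
                  num_samples=64, steps=300, seed=0):
    """Gradient descent on the smoothed objective with a shrinking sigma.

    Assumption: sigma is multiplied by `decay` after every step; this is
    only one of many possible smoothing-parameter update strategies.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    sigma = sigma0
    for _ in range(steps):
        x -= lr * mc_smoothed_grad(f, x, sigma, num_samples, rng)
        sigma *= decay
    return x
```

As a usage example, `mc_gsmooth_gd(lambda x: float(np.sum(x**2)), [3.0, -2.0])` should drive the iterate close to the origin. The cost per step is `2 * num_samples` function evaluations regardless of the input dimension, which is the motivation for sampling instead of computing the smoothing integral directly.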