A Variant of Gradient Descent Algorithm Based on Gradient Averaging

Author: Purkayastha, Saugata; Purkayastha, Sukannya
Publication year: 2020
Subject:
Document type: Working Paper
Description: In this work, we study an optimizer, Grad-Avg, for minimizing error functions. We establish mathematically that the sequence of iterates produced by Grad-Avg converges to a minimizer (under a boundedness assumption). We apply Grad-Avg, alongside several popular optimizers, to regression as well as classification tasks. On regression tasks, the behaviour of Grad-Avg is observed to be almost identical to that of Stochastic Gradient Descent (SGD), and we present a mathematical justification of this fact. On classification tasks, the performance of Grad-Avg can be enhanced by suitably scaling the parameters. Experimental results demonstrate that Grad-Avg converges faster than other state-of-the-art optimizers on the classification task for two benchmark datasets.
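Note: the record carries only the abstract, not the Grad-Avg update rule itself. The Python snippet below is a minimal sketch of one plausible reading of "gradient averaging": each step averages the gradient at the current iterate with the gradient at a provisional plain gradient-descent step. The function name grad_avg, the learning rate lr, and the update rule are illustrative assumptions, not the authors' published method.

```python
import numpy as np

def grad_avg(grad, x0, lr=0.1, n_iters=100):
    """Illustrative gradient-averaging optimizer.

    Assumption: this is NOT necessarily the authors' Grad-Avg update
    (the record omits the exact rule). Each step averages the gradient
    at the current point with the gradient at a provisional GD step.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)                        # gradient at current iterate
        y = x - lr * g                     # provisional plain GD step
        x = x - lr * 0.5 * (g + grad(y))   # step along the averaged gradient
    return x

# Usage: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3)
x_min = grad_avg(lambda x: 2.0 * (x - 3.0), x0=np.array([0.0]))
print(x_min)  # approximately [3.]
```

On a smooth objective with a comparable step size this scheme tracks plain gradient descent closely, which is at least consistent with the abstract's remark that Grad-Avg behaves almost identically to SGD on regression tasks.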
Comment: 9 pages, 4 figures. Accepted at OPT2020: 12th Annual Workshop on Optimization for Machine Learning @ NeurIPS, 2020
Database: arXiv