A Variant of Gradient Descent Algorithm Based on Gradient Averaging

Author: Purkayastha, Saugata; Purkayastha, Sukannya
Publication year: 2020
Subject:
Document type: Working Paper
Description: In this work, we study an optimizer, Grad-Avg, for minimizing error functions. We establish mathematically that the sequence of iterates produced by Grad-Avg converges to a minimizer (under a boundedness assumption). We apply Grad-Avg, alongside several popular optimizers, to regression as well as classification tasks. On regression tasks, the behaviour of Grad-Avg is observed to be almost identical to that of Stochastic Gradient Descent (SGD), and we present a mathematical justification of this fact. On classification tasks, the performance of Grad-Avg can be enhanced by suitably scaling the parameters. Experimental results demonstrate that Grad-Avg converges faster than other state-of-the-art optimizers on the classification task for two benchmark datasets.
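Note: the record carries only the abstract, not the Grad-Avg update rule itself. The Python snippet below is a minimal sketch of one plausible reading of "gradient averaging": each step averages the gradient at the current iterate with the gradient at a provisional plain gradient-descent step. The function name grad_avg, the learning rate lr, and the update rule are illustrative assumptions, not the authors' published method.

```python
import numpy as np

def grad_avg(grad, x0, lr=0.1, n_iters=100):
    """Illustrative gradient-averaging optimizer.

    Assumption: this is NOT necessarily the authors' Grad-Avg update
    (the record omits the exact rule). Each step averages the gradient
    at the current point with the gradient at a provisional GD step.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)                        # gradient at current iterate
        y = x - lr * g                     # provisional plain GD step
        x = x - lr * 0.5 * (g + grad(y))   # step along the averaged gradient
    return x

# Usage: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3)
x_min = grad_avg(lambda x: 2.0 * (x - 3.0), x0=np.array([0.0]))
print(x_min)  # approximately [3.]
```

On a smooth objective with a comparable step size this scheme tracks plain gradient descent closely, which is at least consistent with the abstract's remark that Grad-Avg behaves almost identically to SGD on regression tasks.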
Comment: 9 pages, 4 figures. Accepted at OPT2020: 12th Annual Workshop on Optimization for Machine Learning @ NeurIPS, 2020
Database: arXiv