Accelerating SGD for Distributed Deep-Learning Using Approximated Hessian Matrix

Author: Arnold, Sébastien M. R.; Wang, Chunming
Year of publication: 2017
Subject:
DOI: 10.48550/arxiv.1709.05069
Description: We introduce a novel method to compute a rank $m$ approximation of the inverse of the Hessian matrix in the distributed regime. By leveraging the differences in gradients and parameters across multiple workers, we efficiently implement a distributed approximation of the Newton-Raphson method. We also present preliminary results that underline the advantages and challenges of second-order methods for large stochastic optimization problems. In particular, our work suggests that novel strategies for combining gradients provide further information about the loss surface.
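The abstract does not spell out the update rule, but the construction it describes (a rank-$m$ inverse-Hessian approximation built from differences in workers' gradients and parameters) is in the family of limited-memory quasi-Newton methods. The sketch below is one plausible reading, assuming each of $m$ workers contributes a secant pair (parameter difference, gradient difference) and the approximation is applied through the standard L-BFGS two-loop recursion; the function name, the synthetic quadratic, and all parameters are illustrative, not taken from the paper.

```python
import numpy as np

def apply_inverse_hessian(grad, s_list, y_list):
    """Apply a rank-m inverse-Hessian approximation to `grad` using the
    standard L-BFGS two-loop recursion (assumed here; not the paper's
    stated algorithm).

    s_list[i] -- parameter difference contributed by worker i (new - old)
    y_list[i] -- gradient difference contributed by worker i  (new - old)
    Pairs are ordered oldest to newest.
    """
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    q = grad.copy()
    alphas = []
    # First loop: traverse secant pairs newest to oldest.
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        alpha = rho * (s @ q)
        q -= alpha * y
        alphas.append(alpha)
    # Initial scaling H0 = gamma * I (a common L-BFGS heuristic).
    s_m, y_m = s_list[-1], y_list[-1]
    r = ((s_m @ y_m) / (y_m @ y_m)) * q
    # Second loop: traverse pairs oldest to newest.
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * (y @ r)
        r += (alpha - beta) * s
    return r  # approximate Newton direction H^{-1} grad

# Toy check on a quadratic loss f(theta) = 0.5 * theta^T H theta, where
# the secant relation y = H s holds exactly.
rng = np.random.default_rng(0)
d, m = 10, 8
A = rng.normal(size=(d, d))
H = A @ A.T + np.eye(d)                # symmetric positive definite Hessian
thetas = rng.normal(size=(m + 1, d))   # one parameter snapshot per worker step
s_list = [thetas[i + 1] - thetas[i] for i in range(m)]
y_list = [H @ s for s in s_list]
g = rng.normal(size=d)
step = apply_inverse_hessian(g, s_list, y_list)
newton = np.linalg.solve(H, g)
cos = (step @ newton) / (np.linalg.norm(step) * np.linalg.norm(newton))
print(f"cosine(step, exact Newton step) = {cos:.3f}")
```

On this quadratic the recovered direction aligns increasingly well with the exact Newton step as $m$ grows toward $d$, which is the behavior one would expect from a rank-$m$ approximation of the inverse Hessian.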
ICLR17 Workshop Track
Database: OpenAIRE