Search results - "Badura, Michal"

Report

Benchmarking Neural Network Training Algorithms

Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate sched

External link: http://arxiv.org/abs/2306.07179

Report

Adaptive Gradient Methods at the Edge of Stability

Authors: Cohen, Jeremy M., Ghorbani, Behrooz, Krishnan, Shankar, Agarwal, Naman, Medapati, Sourabh, Badura, Michal, Suo, Daniel, Cardoze, David, Nado, Zachary, Dahl, George E., Gilmer, Justin

Very little is known about the training dynamics of adaptive gradient methods like Adam in deep learning. In this paper, we shed light on the behavior of these algorithms in the full-batch and sufficiently large batch settings. Specifically, we empir

External link: http://arxiv.org/abs/2207.14484
