Showing 1 - 2 of 2 for search: '"Chaubard, Francois"'
We introduce Gradient Agreement Filtering (GAF) to improve on gradient averaging in distributed deep learning optimization. Traditional distributed data-parallel stochastic gradient descent involves averaging gradients of microbatches to calculate a
External link:
http://arxiv.org/abs/2412.18052
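
The abstract above contrasts plain averaging of microbatch gradients with filtering them by agreement, but it is cut off before the filtering rule is stated. The following is only a minimal sketch of that contrast, not the paper's implementation. It assumes that agreement is measured by cosine distance between a running average and each new micro-gradient, and that a gradient is dropped when the distance exceeds a threshold. The function names (`average`, `cosine_distance`, `agreement_filtered_average`) and the `threshold` value are hypothetical.

```python
import math


def average(grads):
    """Plain gradient averaging: element-wise mean over micro-gradients."""
    n = len(grads)
    return [sum(column) / n for column in zip(*grads)]


def cosine_distance(a, b):
    """1 - cosine similarity; treated as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agreement_filtered_average(grads, threshold):
    """Average only the micro-gradients that agree with the running average.

    Assumption (not taken from the abstract): the first micro-gradient seeds
    the running average, and each later one is kept only if its cosine
    distance to the current running average is at most `threshold`.
    Returns the filtered average and the number of gradients kept.
    """
    kept = [grads[0]]
    running = list(grads[0])
    for g in grads[1:]:
        if cosine_distance(running, g) <= threshold:
            kept.append(g)
            running = average(kept)
    return running, len(kept)


# Three toy micro-gradients: two point the same way, one points the opposite way.
micro_grads = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
plain = average(micro_grads)
filtered, n_kept = agreement_filtered_average(micro_grads, threshold=0.5)
```

In this toy case the opposing third gradient pulls the plain average toward zero, while the filtered version keeps only the two gradients that point in a similar direction. How the actual method chooses its threshold, and what it does when no gradients agree, should be checked against the linked paper.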