Showing 1 - 1 of 1 for search: '"Girimaji, Vrinda S."'
Published in:
NeurIPS 2024 (Main Conference)
Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss
External link:
http://arxiv.org/abs/2405.19509
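
The worker and parameter-server setup described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it only covers the exact-aggregation case, assumes a squared-error loss on a linear model, and all names (`split_into_chunks`, `chunk_gradient`, `aggregate`) are hypothetical.

```python
from typing import List, Sequence, Tuple

# One training sample: (feature vector x, target y).
Sample = Tuple[Sequence[float], float]


def split_into_chunks(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Assign samples to workers round-robin (an assumed partitioning scheme)."""
    return [data[i::num_workers] for i in range(num_workers)]


def chunk_gradient(w: Sequence[float], chunk: List[Sample]) -> List[float]:
    """Worker step: sum of per-sample gradients of 0.5 * (w.x - y)^2 over one chunk."""
    grad = [0.0] * len(w)
    for x, y in chunk:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += residual * xj
    return grad


def aggregate(worker_grads: List[List[float]], num_samples: int) -> List[float]:
    """Parameter-server step: sum the workers' gradients and average over all samples."""
    dim = len(worker_grads[0])
    total = [0.0] * dim
    for g in worker_grads:
        for j in range(dim):
            total[j] += g[j]
    return [t / num_samples for t in total]


if __name__ == "__main__":
    data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0), ([2.0, 1.0], 0.0)]
    w = [0.5, -0.5]
    chunks = split_into_chunks(data, num_workers=2)
    grads = [chunk_gradient(w, c) for c in chunks]  # computed independently per worker
    print(aggregate(grads, len(data)))              # exact gradient of the mean loss
```

Because every sample is assigned to exactly one worker and every worker reports, the aggregate equals the full-batch gradient. An approximate version would arise if the server had to proceed without some of the workers' results.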