Showing 1 - 10 of 492
for search: '"Choromanska, A."'
This work focuses on decentralized deep learning optimization. We propose Adjacent Leader Decentralized Gradient Descent (AL-DSGD) to improve final model performance, accelerate convergence, and reduce communication overhead…
External link:
http://arxiv.org/abs/2405.11389
Author:
Dimlioglu, Tolga, Choromanska, Anna
We study distributed training of deep learning models in time-constrained environments. We propose a new algorithm that periodically pulls workers towards a center variable computed as a weighted average of the workers, where the weights are inversely…
External link:
http://arxiv.org/abs/2403.04206
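The record above describes periodically pulling workers toward a center variable formed as a weighted average of the workers. A minimal sketch of one such pull step follows; since the record truncates the paper's actual (inverse) weighting rule, the weights here are a generic placeholder, not the published method:

```python
def center_pull_step(workers, weights, alpha=0.5):
    """Pull each worker's parameter vector toward the weighted center.

    workers: list of parameter vectors (lists of floats), one per worker
    weights: per-worker averaging weights; placeholder for the paper's
             truncated inverse weighting rule
    alpha:   pull strength toward the center variable
    """
    total = sum(weights)
    norm = [w / total for w in weights]          # normalize the weights
    dim = len(workers[0])
    # Center variable: weighted average of all worker parameters.
    center = [sum(nw * p[j] for nw, p in zip(norm, workers))
              for j in range(dim)]
    # Move each worker a fraction alpha of the way toward the center.
    pulled = [[p[j] - alpha * (p[j] - center[j]) for j in range(dim)]
              for p in workers]
    return pulled, center
```

Each call shrinks the spread of the workers around the center by a factor of (1 - alpha), which is the "pulling" behavior the snippet alludes to.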
The goal of lifelong learning is to continuously learn from non-stationary distributions, where the non-stationarity is typically imposed by a sequence of distinct tasks. Prior works have mostly considered idealistic settings, where the identity of t…
External link:
http://arxiv.org/abs/2210.03869
Author:
Fang, Shihong, Zhu, Haoran, Bisla, Devansh, Choromanska, Anna, Ravindran, Satish, Ren, Dongyin, Wu, Ryan
Among various sensors for assisted and autonomous driving systems, automotive radar has been considered a robust and low-cost solution even in adverse weather or lighting conditions. With the recent development of radar technologies and open-sourc…
External link:
http://arxiv.org/abs/2209.12940
Author:
Choromańska, Anna, Szwedowicz, Urszula, Szewczyk, Anna, Daczewska, Małgorzata, Saczko, Jolanta, Kruszakin, Roksana, Pawlik, Krzysztof J., Baczyńska, Dagmara, Kulbacka, Julita
Published in:
In BBA - General Subjects December 2024 1868(12)
In this paper, we study the sharpness of the deep learning (DL) loss landscape around local minima in order to reveal systematic mechanisms underlying the generalization abilities of DL models. Our analysis is performed across varying network and optim…
External link:
http://arxiv.org/abs/2201.08025
Author:
McShea, Bernard, Wright, Kevin, Lam, Denley, Schmidt, Steve, Choromanska, Anna, Bisla, Devansh, Fang, Shihong, Sarmadi, Alireza, Krishnamurthy, Prashanth, Khorrami, Farshad
Securing enterprise networks presents challenges owing to both their size and their distributed structure. Data required to detect and characterize malicious activities may be diffuse and located across network and endpoint devices. Further, cyb…
External link:
http://arxiv.org/abs/2112.04114
Modern deep learning (DL) architectures are trained using variants of the SGD algorithm run with a manually defined learning rate schedule, i.e., the learning rate is dropped at pre-defined epochs, typically when the training l…
External link:
http://arxiv.org/abs/2111.15317
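The record above refers to the standard manual step schedule, where the learning rate is dropped at pre-defined epochs. A minimal sketch of that baseline; the drop epochs and decay factor here are illustrative defaults, not values from the paper:

```python
def step_lr(base_lr, epoch, drop_epochs=(30, 60, 90), factor=0.1):
    """Manual step schedule: multiply the learning rate by `factor`
    each time training passes one of the pre-defined drop epochs."""
    lr = base_lr
    for e in drop_epochs:
        if epoch >= e:
            lr *= factor
    return lr
```

For example, with `base_lr=0.1` the rate stays at 0.1 until epoch 30, then decays by a factor of 10 at each listed epoch; the paper's point is that these drop epochs must be hand-tuned in advance.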
This paper focuses on understanding how the generalization error scales with the amount of training data for deep neural networks (DNNs). Existing techniques in statistical learning require computation of capacity measures, such as VC dimension…
External link:
http://arxiv.org/abs/2105.01867
This paper studies a new design of the optimization algorithm for training deep learning models with a fixed classification-network architecture in a continual learning framework. The training data is non-stationary, and the non-stationarity is…
External link:
http://arxiv.org/abs/2011.12581