Showing 1 - 10 of 541 for search: '"KAR, SOUMMYA"'
Pre-training Transformer models is resource-intensive, and recent studies have shown that sign momentum is an efficient technique for training large-scale deep learning models, particularly Transformers. However, its application in distributed traini
External link:
http://arxiv.org/abs/2411.17866
We study large deviations and mean-squared error (MSE) guarantees of a general framework of nonlinear stochastic gradient methods in the online setting, in the presence of heavy-tailed noise. Unlike existing works that rely on the closed form of a no
External link:
http://arxiv.org/abs/2410.15637
Authors:
Armacki, Aleksandar; Yu, Shuhua; Sharma, Pranay; Joshi, Gauri; Bajovic, Dragana; Jakovetic, Dusan; Kar, Soummya
We study high-probability convergence in online learning, in the presence of heavy-tailed noise. To combat the heavy tails, a general framework of nonlinear SGD methods is considered, subsuming several popular nonlinearities like sign, quantization,
External link:
http://arxiv.org/abs/2410.13954
The occlusion of the sun by clouds is one of the primary sources of uncertainties in solar power generation, and is a factor that affects the wide-spread use of solar power as a primary energy source. Real-time forecasting of cloud movement and, as a
External link:
http://arxiv.org/abs/2409.12016
We develop a clearance and settlement model for Peer-to-Peer (P2P) energy trading in low-voltage networks. The model enables direct transactions between parties within an open and distributed system and integrates unused capacity while respecting net
External link:
http://arxiv.org/abs/2407.21403
The rapid adoption of Electric Vehicles (EVs) poses challenges for electricity grids to accommodate or mitigate peak demand. Vehicle-to-Vehicle Charging (V2VC) has been recently adopted by popular EVs, posing new opportunities and challenges to the m
External link:
http://arxiv.org/abs/2404.08837
In this paper, we study the problem of ensuring safety with a few shots of samples for partially unknown systems. We first characterize a fundamental limit when producing safe actions is not possible due to insufficient information or samples. Then,
External link:
http://arxiv.org/abs/2403.06045
We develop a family of distributed center-based clustering algorithms that work over networks of users. In the proposed scenario, users contain a local dataset and communicate only with their immediate neighbours, with the aim of finding a clustering
External link:
http://arxiv.org/abs/2402.01302
Authors:
Armacki, Aleksandar; Sharma, Pranay; Joshi, Gauri; Bajovic, Dragana; Jakovetic, Dusan; Kar, Soummya
We study high-probability convergence guarantees of learning on streaming data in the presence of heavy-tailed noise. In the proposed scenario, the model is updated in an online fashion, as new information is observed, without storing any additional
External link:
http://arxiv.org/abs/2310.18784
Motivated by understanding and analysis of large-scale machine learning under heavy-tailed gradient noise, we study decentralized optimization with gradient clipping, i.e., in which certain clipping operators are applied to the gradients or gradient
External link:
http://arxiv.org/abs/2310.16920