Showing 1 - 10 of 418 for search: '"Papyan, A."'
Author:
Zhang, Stephen, Papyan, Vardan
The recent paradigm shift to large-scale foundation models has brought about a new era for deep learning that, while it has found great success in practice, has also been plagued by prohibitively expensive costs in terms of high memory consumption and c
External link:
http://arxiv.org/abs/2409.13652
Large Language Models (LLMs) have made significant strides in natural language processing, and a precise understanding of the internal mechanisms driving their success is essential. In this work, we trace the trajectories of individual tokens as they
External link:
http://arxiv.org/abs/2407.07810
Author:
Zhang, Stephen, Papyan, Vardan
Pruning has emerged as a promising approach for compressing large-scale models, yet its effectiveness in recovering the sparsest of models has not yet been explored. We conducted an extensive series of 485,838 experiments, applying a range of state-o
External link:
http://arxiv.org/abs/2407.04075
Large Language Models (LLMs) are vulnerable to jailbreaks–methods to elicit harmful or generally impermissible outputs. Safety measures are developed and assessed on their effectiveness at defending against jailbreak attacks, indicati
External link:
http://arxiv.org/abs/2407.02551
Published in:
Pattern Recognit. Image Anal. 34 (2024)
In this survey, we focus on drone-based systems for detecting individuals, particularly by identifying human screams and other distress signals. This study has significant relevance in post-disaster scenarios, including events
External link:
http://arxiv.org/abs/2406.15875
Author:
Wu, Robert, Papyan, Vardan
Neural collapse ($\mathcal{NC}$) is a phenomenon observed in classification tasks where top-layer representations collapse into their class means, which become equinorm, equiangular and aligned with the classifiers. These behaviors -- associated with
External link:
http://arxiv.org/abs/2405.17767
Mixup is a data augmentation strategy that employs convex combinations of training instances and their respective labels to augment the robustness and calibration of deep neural networks. Despite its widespread adoption, the nuanced mechanisms that u
External link:
http://arxiv.org/abs/2402.06171
Designing deep neural network classifiers that perform robustly on distributions differing from the available training data is an active area of machine learning research. However, out-of-distribution generalization for regression, the analogous probl
External link:
http://arxiv.org/abs/2312.17463
Large language models (LLMs) have exhibited impressive capabilities in comprehending complex instructions. However, their blind adherence to provided instructions has led to concerns regarding risks of malicious use. Existing defence mechanisms, such
External link:
http://arxiv.org/abs/2307.10719
Published in:
Российский офтальмологический журнал, Vol 17, Iss 2, Pp 128-134 (2024)
The review presents recent research on new technologies for scleral collagen crosslinking, a promising approach to sclera-strengthening treatment of progressive myopia. We assess the advantages and limitations of a number of experimental techniqu
External link:
https://doaj.org/article/0c6b4d42043842058a6653b00ce735f2