Showing 1 - 10 of 116 results for search: '"Dauphin, Yann"'
Author:
Nowak, Aleksandra I., Mercea, Otniel-Bogdan, Arnab, Anurag, Pfeiffer, Jonas, Dauphin, Yann, Evci, Utku
Parameter-efficient transfer learning (PETL) aims to adapt pre-trained models to new downstream tasks while minimizing the number of fine-tuned parameters. Adapters, a popular approach in PETL, inject additional capacity into existing networks…
External link:
http://arxiv.org/abs/2410.15858
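The abstract above is truncated, but the adapter pattern it refers to is standard: a small bottleneck module attached to a frozen backbone. Below is a minimal sketch in JAX, an illustration of the general pattern rather than this paper's specific method; the dimensions and zero initialization of the up-projection are assumptions.

    import jax
    import jax.numpy as jnp

    def init_adapter(key, d_model=768, d_bottleneck=64):
        # Up-projection starts at zero so the adapter is initially the
        # identity map (a common convention, assumed here).
        return {
            "down": jax.random.normal(key, (d_model, d_bottleneck)) * 0.02,
            "up": jnp.zeros((d_bottleneck, d_model)),
        }

    def adapter(params, h):
        # Residual bottleneck: h + up(relu(down(h))).
        # Only these few parameters are trained; the backbone stays frozen.
        return h + jax.nn.relu(h @ params["down"]) @ params["up"]

With d_model=768 and d_bottleneck=64, this adds roughly 2 * 768 * 64 ≈ 98k trainable parameters per layer, a small fraction of a typical backbone.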
Recent work has shown that methods like SAM, which either explicitly or implicitly penalize second-order information, can improve generalization in deep learning. Seemingly similar methods like weight noise and gradient penalties often fail to provide…
External link:
http://arxiv.org/abs/2401.10809
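The snippet mentions SAM without showing it; the generic two-step SAM update (the widely known form from the original SAM paper, not this paper's analysis) is short enough to sketch in JAX. For simplicity, params is assumed to be a flat array.

    import jax
    import jax.numpy as jnp

    def sam_step(loss_fn, params, rho=0.05, lr=0.1):
        # 1) Gradient at the current point.
        g = jax.grad(loss_fn)(params)
        # 2) Ascend to an approximate worst case within a rho-ball.
        eps = rho * g / (jnp.linalg.norm(g) + 1e-12)
        # 3) Descend with the gradient taken at the perturbed point;
        #    this implicitly penalizes sharp, high-curvature minima.
        return params - lr * jax.grad(loss_fn)(params + eps)

A plain gradient penalty, by contrast, adds a ||grad L||^2 term to the loss directly; why such seemingly similar regularizers behave differently is the question the abstract raises.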
Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as…
External link:
http://arxiv.org/abs/2311.14115
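The abstract cuts off before naming the framing, but the standard one in this literature is a Bradley-Terry model over pairwise preferences (assumed here; the paper may analyze or depart from it). Its loss is a one-liner in JAX:

    import jax.numpy as jnp
    from jax.nn import log_sigmoid

    def preference_loss(r_chosen, r_rejected):
        # Bradley-Terry negative log-likelihood: the modeled probability
        # that the chosen response beats the rejected one is
        # sigmoid(r_chosen - r_rejected).
        return -jnp.mean(log_sigmoid(r_chosen - r_rejected))

Here r_chosen and r_rejected are reward-model scores for the preferred and dispreferred responses in each pair.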
We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process. We observe…
External link:
http://arxiv.org/abs/2306.03262
Data augmentation methods have played an important role in the recent advance of deep learning models, and have become an indispensable component of state-of-the-art models in semi-supervised, self-supervised, and supervised training for vision…
External link:
http://arxiv.org/abs/2305.13520
Author:
Lee, Joo Hyung, Park, Wonpyo, Mitchell, Nicole, Pilault, Jonathan, Obando-Ceron, Johan, Kim, Han-Byul, Lee, Namhoon, Frantar, Elias, Long, Yun, Yazdanbakhsh, Amir, Agrawal, Shivani, Subramanian, Suvinay, Wang, Xin, Kao, Sheng-Chun, Zhang, Xingyao, Gale, Trevor, Bik, Aart, Han, Woohyun, Ferev, Milen, Han, Zhonglin, Kim, Hong-Seok, Dauphin, Yann, Dziugaite, Gintare Karolina, Castro, Pablo Samuel, Evci, Utku
This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training…
External link:
http://arxiv.org/abs/2304.14082
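The record does not show JaxPruner's actual interface, so what follows is only a generic illustration of the kind of operation such a library implements: one-shot global magnitude pruning over a parameter pytree. This is a sketch of the technique, not the library's API.

    import jax.numpy as jnp
    from jax.flatten_util import ravel_pytree

    def magnitude_prune(params, sparsity=0.9):
        # Zero the smallest-magnitude weights globally, keeping the top
        # (1 - sparsity) fraction -- the simplest pruning baseline.
        flat, unravel = ravel_pytree(params)
        k = int(sparsity * flat.size)
        threshold = jnp.sort(jnp.abs(flat))[k]
        return unravel(jnp.where(jnp.abs(flat) >= threshold, flat, 0.0))

Sparse training methods go further by updating such masks during training rather than pruning once after it.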
Deep networks have achieved impressive results on a range of well-curated benchmark datasets. Surprisingly, their performance remains sensitive to perturbations that have little effect on human performance. In this work, we propose a novel extension…
External link:
http://arxiv.org/abs/2304.02847
Author:
Agarwala, Atish, Dauphin, Yann N.
The Sharpness-Aware Minimization (SAM) optimization algorithm has been shown to control large eigenvalues of the loss Hessian and provide generalization benefits in a variety of settings. The original motivation for SAM was a modified loss function…
External link:
http://arxiv.org/abs/2302.08692
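For reference, the modified loss the truncated sentence refers to is SAM's min-max objective, as stated in the original SAM paper:

    \mathcal{L}_{\mathrm{SAM}}(w) = \max_{\|\epsilon\|_2 \le \rho} \mathcal{L}(w + \epsilon)

In practice the inner maximization is approximated by a single normalized gradient-ascent step, which is what connects SAM to the Hessian eigenvalue control this abstract describes.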
Author:
Rastogi, Charvi, Stelmakh, Ivan, Beygelzimer, Alina, Dauphin, Yann N., Liang, Percy, Vaughan, Jennifer Wortman, Xue, Zhenyu, Daumé III, Hal, Pierson, Emma, Shah, Nihar B.
How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors…
External link:
http://arxiv.org/abs/2211.12966
Published in:
International Conference on Learning Representations (ICLR) 2022
Despite being able to capture a range of features of the data, high accuracy models trained with supervision tend to make similar predictions. This seemingly implies that high-performing models share similar biases regardless of training methodology…
External link:
http://arxiv.org/abs/2110.12899