Showing 1 - 10 of 42 for search: '"Jain, Neel"'
Author:
White, Colin, Dooley, Samuel, Roberts, Manley, Pal, Arka, Feuer, Ben, Jain, Siddhartha, Shwartz-Ziv, Ravid, Jain, Neel, Saifullah, Khalid, Naidu, Siddartha, Hegde, Chinmay, LeCun, Yann, Goldstein, Tom, Neiswanger, Willie, Goldblum, Micah
Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource…
External link:
http://arxiv.org/abs/2406.19314
Author:
Chen, Jiuhai, Qadri, Rifaa, Wen, Yuxin, Jain, Neel, Kirchenbauer, John, Zhou, Tianyi, Goldstein, Tom
Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, there is a need…
External link:
http://arxiv.org/abs/2406.10323
Author:
Hans, Abhimanyu, Wen, Yuxin, Jain, Neel, Kirchenbauer, John, Kazemi, Hamid, Singhania, Prajwal, Singh, Siddharth, Somepalli, Gowthami, Geiping, Jonas, Bhatele, Abhinav, Goldstein, Tom
Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, …
External link:
http://arxiv.org/abs/2406.10209
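The core idea of the goldfish loss, as the abstract describes it, is to exclude a pseudo-random subset of tokens from the next-token loss so the model never learns to reproduce full passages verbatim. The sketch below is a hypothetical illustration of that idea, not the authors' exact implementation: it builds a 1-in-`k` drop mask where the decision for each position depends on a hash of the preceding `context` token IDs, so repeated passages are masked consistently.

```python
def goldfish_mask(token_ids, k=4, context=13):
    # Returns a 0/1 mask over positions; mask[i] == 0 means position i is
    # excluded from the next-token loss. The first `context` positions are
    # always kept because there is not enough preceding context to hash.
    # `k` and `context` are illustrative hyperparameters, not the paper's.
    mask = [1] * len(token_ids)
    for i in range(context, len(token_ids)):
        # Hash the preceding context window; int-tuple hashing is
        # deterministic in CPython, so identical passages mask identically.
        h = hash(tuple(token_ids[i - context:i]))
        if h % k == 0:
            mask[i] = 0
    return mask
```

During training, the per-token cross-entropy would simply be multiplied by this mask before averaging; inference is unchanged.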
Author:
McLeish, Sean, Bansal, Arpit, Stein, Alex, Jain, Neel, Kirchenbauer, John, Bartoldson, Brian R., Kailkhura, Bhavya, Bhatele, Abhinav, Geiping, Jonas, Schwarzschild, Avi, Goldstein, Tom
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that…
External link:
http://arxiv.org/abs/2405.17399
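The abstract's fix is to give each digit an extra embedding encoding its position within its own number. A minimal sketch of the index computation, assuming a tokenization where each digit is a single token (the embedding-table lookup and the authors' exact scheme are not shown here):

```python
def digit_positions(tokens, digit_ids):
    # For each token, its 1-based position inside a contiguous run of digit
    # tokens, and 0 for non-digit tokens. These indices would select rows of
    # a learned embedding table that is added to the usual input embeddings.
    positions, run = [], 0
    for t in tokens:
        run = run + 1 if t in digit_ids else 0
        positions.append(run)
    return positions
```

For the token stream `[A, d1, d2, d3, B, d4]` this yields `[0, 1, 2, 3, 0, 1]`: the model can tell the hundreds digit from the ones digit regardless of where the number sits in the sequence.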
Author:
Jain, Neel, Chiang, Ping-yeh, Wen, Yuxin, Kirchenbauer, John, Chu, Hong-Min, Somepalli, Gowthami, Bartoldson, Brian R., Kailkhura, Bhavya, Schwarzschild, Avi, Saha, Aniruddha, Goldblum, Micah, Geiping, Jonas, Goldstein, Tom
We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which…
External link:
http://arxiv.org/abs/2310.05914
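The NEFTune augmentation the abstract describes is small enough to sketch: add uniform noise to the input embeddings at training time and leave inference untouched. The scaling below, alpha divided by the square root of sequence length times embedding dimension, follows the paper's description; treat the defaults as illustrative.

```python
import numpy as np

def neftune_noise(embeddings, alpha=5.0, rng=None):
    # embeddings: (seq_len, dim) matrix of token embeddings for one sequence.
    # Adds uniform noise in [-1, 1] scaled by alpha / sqrt(seq_len * dim),
    # applied only during finetuning; clean embeddings are used at inference.
    rng = rng if rng is not None else np.random.default_rng(0)
    seq_len, dim = embeddings.shape
    scale = alpha / np.sqrt(seq_len * dim)
    noise = rng.uniform(-1.0, 1.0, size=embeddings.shape) * scale
    return embeddings + noise
```

Because the scale shrinks with sequence length and dimension, the perturbation stays small relative to the embedding norms; the claimed effect is a regularization that reduces overfitting to the instruction data.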
Author:
Jain, Neel, Schwarzschild, Avi, Wen, Yuxin, Somepalli, Gowthami, Kirchenbauer, John, Chiang, Ping-yeh, Goldblum, Micah, Saha, Aniruddha, Geiping, Jonas, Goldstein, Tom
As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing from the rich…
External link:
http://arxiv.org/abs/2309.00614
Author:
Jain, Neel, Saifullah, Khalid, Wen, Yuxin, Kirchenbauer, John, Shu, Manli, Saha, Aniruddha, Goldblum, Micah, Geiping, Jonas, Goldstein, Tom
With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that the model will…
External link:
http://arxiv.org/abs/2306.13651
The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which…
External link:
http://arxiv.org/abs/2302.03668
Let $G=(V,E)$ be a finite connected graph along with a coloring of the vertices of $G$ using the colors in a given set $X$. In this paper, we introduce multi-color forcing, a generalization of zero-forcing on graphs, and give conditions in which the…
External link:
http://arxiv.org/abs/1912.02001
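For context on what this paper generalizes: in standard zero forcing (the single-color case), a filled vertex with exactly one unfilled neighbor "forces" that neighbor to become filled, and the rule is applied until no more forces are possible. The sketch below implements that baseline rule only; the paper's multi-color rules are not reproduced here.

```python
def zero_forcing_closure(adj, filled):
    # adj: dict mapping each vertex to a list of its neighbors.
    # filled: initial set of filled vertices.
    # Repeatedly applies the zero-forcing rule to a fixed point and
    # returns the closure (the final set of filled vertices).
    filled = set(filled)
    changed = True
    while changed:
        changed = False
        for v in list(filled):
            unfilled = [u for u in adj[v] if u not in filled]
            if len(unfilled) == 1:  # v forces its unique unfilled neighbor
                filled.add(unfilled[0])
                changed = True
    return filled
```

On a path, a single filled endpoint forces the entire graph; on a star, a filled center with several unfilled leaves forces nothing, which is why zero-forcing sets are nontrivial to find.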
Academic article
This result cannot be displayed to unauthenticated users.
Signing in is required to view this result.