Showing 1 - 10
of 23
for search: '"Petty, Jackson"'
Large language models are increasingly trained on corpora containing both natural language and non-linguistic data like source code. Aside from aiding programming-related tasks, anecdotal evidence suggests that including code in pretraining corpora m…
External link:
http://arxiv.org/abs/2409.04556
State-space models (SSMs) have emerged as a potential alternative architecture for building large language models (LLMs) compared to the previously ubiquitous transformer architecture. One theoretical weakness of transformers is that they cannot expr…
External link:
http://arxiv.org/abs/2404.08819
Author:
Rein, David, Hou, Betty Li, Stickland, Asa Cooper, Petty, Jackson, Pang, Richard Yuanzhe, Dirani, Julien, Michael, Julian, Bowman, Samuel R.
We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the…
External link:
http://arxiv.org/abs/2311.12022
Author:
Michael, Julian, Mahdi, Salsabila, Rein, David, Petty, Jackson, Dirani, Julien, Padmakumar, Vishakh, Bowman, Samuel R.
As AI systems are used to answer more difficult questions and potentially help create new knowledge, judging the truthfulness of their outputs becomes more difficult and more important. How can we supervise unreliable experts, which have access to th…
External link:
http://arxiv.org/abs/2311.08702
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer the underlyin…
External link:
http://arxiv.org/abs/2311.07811
Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We i…
External link:
http://arxiv.org/abs/2311.04900
Author:
Petty, Jackson, van Steenkiste, Sjoerd, Dasgupta, Ishita, Sha, Fei, Garrette, Dan, Linzen, Tal
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hypothesis, mo…
External link:
http://arxiv.org/abs/2310.19956
Naturally occurring information-seeking questions often contain questionable assumptions -- assumptions that are false or unverifiable. Questions containing questionable assumptions are challenging because they require a distinct answer strategy that…
External link:
http://arxiv.org/abs/2212.10003
How is knowledge of position-role mappings in natural language learned? We explore this question in a computational setting, testing whether a variety of well-performing pretrained language models (BERT, RoBERTa, and DistilBERT) exhibit knowledge of t…
External link:
http://arxiv.org/abs/2202.03611
Author:
Petty, Jackson, Frank, Robert
Natural language exhibits patterns of hierarchically governed dependencies, in which relations between words are sensitive to syntactic structure rather than linear ordering. While recurrent network models often fail to generalize in a hierarchicall…
External link:
http://arxiv.org/abs/2109.12036