Showing 1 - 10 of 257 for search: '"McCoy, R. P."'
In "Embers of Autoregression" (McCoy et al., 2023), we showed that several large language models (LLMs) have some important limitations that are attributable to their origins in next-word prediction. Here we investigate whether these issues persist w
Externí odkaz:
http://arxiv.org/abs/2410.01792
Chain-of-Thought (CoT) prompting has been shown to enhance the multi-step reasoning capabilities of Large Language Models (LLMs). However, debates persist about whether LLMs exhibit abstract generalization or rely on shallow heuristics when given CoT…
External link:
http://arxiv.org/abs/2407.01687
Large language models (LLMs) have shown the emergent capability of in-context learning (ICL). One line of research has explained ICL as functionally performing gradient descent. In this paper, we introduce a new way of diagnosing whether ICL is functionally…
External link:
http://arxiv.org/abs/2406.18501
Author:
Chi, Nathan A., Malchev, Teodor, Kong, Riley, Chi, Ryan A., Huang, Lucas, Chi, Ethan A., McCoy, R. Thomas, Radev, Dragomir
We introduce modeLing, a novel benchmark of Linguistics Olympiad-style puzzles which tests few-shot reasoning in AI systems. Solving these puzzles necessitates inferring aspects of a language's grammatical structure from a small number of examples…
External link:
http://arxiv.org/abs/2406.17038
Author:
Becker, McCoy R., Lew, Alexander K., Wang, Xiaoyan, Ghavami, Matin, Huot, Mathieu, Rinard, Martin C., Mansinghka, Vikash K.
Published in:
PLDI 2024
Compared to the wide array of advanced Monte Carlo methods supported by modern probabilistic programming languages (PPLs), PPL support for variational inference (VI) is less developed: users are typically limited to a predefined selection of variational…
External link:
http://arxiv.org/abs/2406.15742
Humans can learn new concepts from a small number of examples by drawing on their inductive biases. These inductive biases have previously been captured by using Bayesian models defined over symbolic hypothesis spaces. Is it possible to create a neural…
External link:
http://arxiv.org/abs/2402.07035
Large language models (LLMs) can produce long, coherent passages of text, suggesting that LLMs, although trained on next-word prediction, must represent the latent structure that characterizes a document. Prior work has found that internal representations…
External link:
http://arxiv.org/abs/2312.14226
The success of methods based on artificial neural networks in creating intelligent machines seems like it might pose a challenge to explanations of human cognition in terms of Bayesian inference. We argue that this is not the case, and that in fact…
External link:
http://arxiv.org/abs/2311.10206
The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that in order to develop a holistic understanding of these systems we need to consider the problem that they were trained…
External link:
http://arxiv.org/abs/2309.13638
Author:
McCoy, R. Thomas, Griffiths, Thomas L.
Humans can learn languages from remarkably little experience. Developing computational models that explain this ability has been a major challenge in cognitive science. Bayesian models that build in strong inductive biases - factors that guide generalization…
External link:
http://arxiv.org/abs/2305.14701