A dataset of questions on decision-theoretic reasoning in Newcomb-like problems

Autor:	Oesterheld, Caspar, Cooper, Emery, Kodama, Miles, Nguyen, Linh Chi, Perez, Ethan
Rok vydání:	2024
Předmět:	Computer Science - Computation and Language Computer Science - Artificial Intelligence I.2.7
Druh dokumentu:	Working Paper
Popis:	We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is important because interactions between foundation-model-based agents will often be Newcomb-like. Some ways of reasoning about Newcomb-like problems may allow for greater cooperation between models. Our dataset contains both capabilities questions (i.e., questions with a unique, uncontroversially correct answer) and attitude questions (i.e., questions about which decision theorists would disagree). We use our dataset for an investigation of decision-theoretical capabilities and expressed attitudes and their interplay in existing models (different models by OpenAI, Anthropic, Meta, GDM, Reka, etc.), as well as models under simple prompt-based interventions. We find, among other things, that attitudes vary significantly between existing models; that high capabilities are associated with attitudes more favorable toward so-called evidential decision theory; and that attitudes are consistent across different types of questions. Comment: 48 pages, 15 figures; code and data at https://github.com/casparoe/newcomblike_questions_dataset
Databáze:	arXiv
Externí odkaz:	http://arxiv.org/abs/2411.10588 Zobrazit plný text záznamu View this record from Arxiv