Showing 1 - 10 of 1,098 for search: '"A P Mahowald"'
Math is constructed by people for people: just as natural language corpora reflect not just propositions but the communicative goals of language users, the math data that models are trained on reflects not just idealized mathematical entities but ric…
External link:
http://arxiv.org/abs/2409.17005
Author:
Sprague, Zayne, Yin, Fangcong, Rodriguez, Juan Diego, Jiang, Dongwei, Wadhwa, Manya, Singhal, Prasann, Zhao, Xinyu, Ye, Xi, Mahowald, Kyle, Durrett, Greg
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). But for what kinds of tasks is this extra "thinking" really helpful? To analyze this, we conducted a quantitative me…
External link:
http://arxiv.org/abs/2409.12183
The variations between in-group and out-group speech (intergroup bias) are subtle and could underlie many social phenomena like stereotype perpetuation and implicit bias. In this paper, we model the intergroup bias as a tagging task on English sports…
External link:
http://arxiv.org/abs/2406.17947
English allows for both compounds (e.g., London-made) and phrasal paraphrases (e.g., made in London). While these constructions have roughly the same truth-conditional meaning, we hypothesize that the compound allows less freedom to express the natur…
External link:
http://arxiv.org/abs/2405.10457
Author:
Misra, Kanishka, Mahowald, Kyle
Language models learn rare syntactic phenomena, but the extent to which this is attributable to generalization vs. memorization is a major open question. To that end, we iteratively trained transformer language models on systematically manipulated co…
External link:
http://arxiv.org/abs/2403.19827
Published in:
Proceedings of the National Academy of Sciences, 121(36), e2400917121 (2024)
Do large language models (LLMs) make human-like linguistic generalizations? Dentella et al. (2023) ("DGL") prompt several LLMs ("Is the following sentence grammatically correct in English?") to elicit grammaticality judgments of 80 English sentences, …
External link:
http://arxiv.org/abs/2402.01676
Recent zero-shot evaluations have highlighted important limitations in the abilities of language models (LMs) to perform meaning extraction. However, it is now well known that LMs can demonstrate radical improvements in the presence of experimental c…
External link:
http://arxiv.org/abs/2401.06640
Chomsky and others have very directly claimed that large language models (LLMs) are equally capable of learning languages that are possible and impossible for humans to learn. However, there is very little published experimental evidence to support s…
External link:
http://arxiv.org/abs/2401.06416
Author:
Lederman, Harvey, Mahowald, Kyle
Are LLMs cultural technologies like photocopiers or printing presses, which transmit information but cannot create new content? A challenge for this idea, which we call bibliotechnism, is that LLMs generate novel text. We begin with a defense of bibl…
External link:
http://arxiv.org/abs/2401.04854
Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative costs. Under the assumption that cost is given by an utterance's length, he supported this claim by showing that words' lengths are inversely correlated with their…
External link:
http://arxiv.org/abs/2312.03897