Showing 1 - 10 of 111 for search: '"Jacobs, Cassandra L."'
In this work we compare the generative behavior at the next token prediction level in several language models by comparing them to human productions in the cloze task. We find that while large models trained for longer are typically better estimators
External link:
http://arxiv.org/abs/2410.12057
Annotation of discourse relations is a known difficult task, especially for non-expert annotators. In this paper, we investigate novice annotators' uncertainty on the annotation of discourse relations on spoken conversational data. We find that dialo
External link:
http://arxiv.org/abs/2308.07179
Time pressure and topic negotiation may impose constraints on how people leverage discourse relations (DRs) in spontaneous conversational contexts. In this work, we adapt a system of DRs for written language to spontaneous dialogue using crowdsourced
External link:
http://arxiv.org/abs/2307.03645
Author:
Jacobs, Cassandra L., Pinter, Yuval
We look at a decision taken early in training a subword tokenizer, namely whether it should be the word-initial token that carries a special mark, or the word-final one. Based on surface-level considerations of efficiency and cohesion, as well as mor
External link:
http://arxiv.org/abs/2208.01561
Natural language processing systems often struggle with out-of-vocabulary (OOV) terms, which do not appear in training data. Blends, such as "innoventor", are one particularly challenging class of OOV, as they are formed by fusing together two or mor
External link:
http://arxiv.org/abs/2009.09123
We present the New York Times Word Innovation Types dataset, or NYTWIT, a collection of over 2,500 novel English words published in the New York Times between November 2017 and March 2019, manually annotated for their class of novelty (such as lexica
External link:
http://arxiv.org/abs/2003.03444
Published in:
Cognition, January 2023, 230
Published in:
Journal of Memory and Language, February 2020, 110