Showing 1 - 10 of 41 for search: '"van Schijndel, Marten"'
In this work, we use language modeling to investigate the factors that influence code-switching. Code-switching occurs when a speaker alternates between one language variety (the primary language) and another (the secondary language), and is widely o…
External link:
http://arxiv.org/abs/2408.04596
Pretrained language model (PLM) hidden states are frequently employed as contextual word embeddings (CWE): high-dimensional representations that encode semantic information given linguistic context. Across many areas of computational linguistics rese…
External link:
http://arxiv.org/abs/2408.04162
Previous work has shown that isolated non-canonical sentences with Object-before-Subject (OSV) order are initially harder to process than their canonical counterparts with Subject-before-Object (SOV) order. Although this difficulty diminishes with ap…
External link:
http://arxiv.org/abs/2405.07730
We test the hypothesis that discourse predictability influences Hindi syntactic choice. While prior work has shown that a number of factors (e.g., information status, dependency length, and syntactic surprisal) influence Hindi word order preferences, …
External link:
http://arxiv.org/abs/2210.13940
Word order choices during sentence production can be primed by preceding sentences. In this work, we test the DUAL MECHANISM hypothesis that priming is driven by multiple different sources. Using a Hindi corpus of text productions, we model lexical p…
External link:
http://arxiv.org/abs/2210.13938
Author:
Timkey, William, van Schijndel, Marten
Similarity measures are a vital tool for understanding how language models represent and process language. Standard representational similarity measures such as cosine similarity and Euclidean distance have been successfully used in static word embed…
External link:
http://arxiv.org/abs/2109.04404
Abstractive neural summarization models have seen great improvements in recent years, as shown by ROUGE scores of the generated summaries. But despite these improved metrics, there is limited understanding of the strategies different models employ, a…
External link:
http://arxiv.org/abs/2106.01581
Author:
Davis, Forrest, van Schijndel, Marten
A growing body of literature has focused on detailing the linguistic knowledge embedded in large, pretrained language models. Existing work has shown that non-linguistic biases in models can drive model behavior away from linguistic generalizations.
External link:
http://arxiv.org/abs/2106.01207
Author:
Davis, Forrest, van Schijndel, Marten
Language models (LMs) trained on large quantities of text have been claimed to acquire abstract linguistic representations. Our work tests the robustness of these abstractions by focusing on the ability of LMs to learn interactions between different …
External link:
http://arxiv.org/abs/2010.04887
Author:
Davis, Forrest, van Schijndel, Marten
A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e. is a grammatical sentence more probable than an ungrammatical sentence). Our work uses ambiguous relative …
External link:
http://arxiv.org/abs/2005.00165