Showing 1 - 10 of 25 for search: '"Jumelet, Jaap"'
In English and other languages, multiple adjectives in a complex noun phrase show intricate ordering patterns that have been a target of much linguistic theory. These patterns offer an opportunity to assess the ability of language models (LMs) to lea…
External link:
http://arxiv.org/abs/2407.02136
The usual way to interpret language models (LMs) is to test their performance on different benchmarks and subsequently infer their internal processes. In this paper, we present an alternative approach, concentrating on the quality of LM processing, w…
External link:
http://arxiv.org/abs/2406.06441
We explore which linguistic factors -- at the sentence and token level -- play an important role in influencing language model predictions, and investigate whether these are reflective of results found in humans and human corpora (Gries and Kootstra,…
External link:
http://arxiv.org/abs/2406.04847
Author:
Patil, Abhinav, Jumelet, Jaap, Chiu, Yu Ying, Lapastora, Andy, Shen, Peter, Wang, Lexie, Willrich, Clevis, Steinert-Threlkeld, Shane
This paper introduces Filtered Corpus Training, a method that trains language models (LMs) on corpora with certain linguistic constructions filtered out from the training data, and uses it to measure the ability of LMs to perform linguistic generaliz…
External link:
http://arxiv.org/abs/2405.15750
Language models are often used as the backbone of modern dialogue systems. These models are pre-trained on large amounts of written fluent language. Repetition is typically penalised when evaluating language model generations. However, it is a key co…
External link:
http://arxiv.org/abs/2311.13061
Author:
Jumelet, Jaap, Zuidema, Willem
We present a setup for training, evaluating and interpreting neural language models, that uses artificial, language-like data. The data is generated using a massive probabilistic grammar (based on state-split PCFGs), that is itself derived from a lar…
External link:
http://arxiv.org/abs/2310.14840
Author:
Jumelet, Jaap, Hanna, Michael, Kloots, Marianne de Heer, Langedijk, Anna, Pouw, Charlotte, van der Wal, Oskar
We present the submission of the ILLC at the University of Amsterdam to the BabyLM challenge (Warstadt et al., 2023), in the strict-small track. Our final model, ChapGTP, is a masked language model that was trained for 200 epochs, aided by a novel da…
External link:
http://arxiv.org/abs/2310.11282
In recent years, many interpretability methods have been proposed to help interpret the internal states of Transformer models, at different levels of precision and complexity. Here, to analyze encoder-decoder Transformers, we propose a simple, new me…
External link:
http://arxiv.org/abs/2310.03686
Curriculum learning (CL) posits that machine learning models -- similar to humans -- may learn more efficiently from data that match their current learning progress. However, CL methods are still poorly understood and, in particular for natural langu…
External link:
http://arxiv.org/abs/2308.12202
Author:
Jumelet, Jaap, Zuidema, Willem
We study feature interactions in the context of feature attribution methods for post-hoc interpretability. In interpretability research, getting to grips with feature interactions is increasingly recognised as an important challenge, because interact…
External link:
http://arxiv.org/abs/2306.12181