Showing 1 - 10 of 22 for the search: '"Snæbjarnarson, Vésteinn"'
Understanding and manipulating the causal generation mechanisms in language models is essential for controlling their behavior. Previous work has primarily relied on techniques such as representation surgery -- e.g., model ablations or manipulation of …
External link:
http://arxiv.org/abs/2411.07180
Author:
Stoehr, Niklas, Du, Kevin, Snæbjarnarson, Vésteinn, West, Robert, Cotterell, Ryan, Schein, Aaron
Given the prompt "Rome is in", can we steer a language model to flip its prediction of an incorrect token "France" to a correct token "Italy" by only multiplying a few relevant activation vectors with scalars? We argue that successfully intervening on …
External link:
http://arxiv.org/abs/2410.04962
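The intervention the abstract describes -- rescaling a few activation directions to flip a prediction -- can be illustrated with a toy example (a hypothetical two-token setup with made-up vectors, not the paper's code or model):

```python
import numpy as np

# Toy sketch: a hidden state is a sum of feature directions, and the
# unembedding reads a logit off each direction. Rescaling only one
# direction's coefficient can flip which token ranks highest.
u_france = np.array([1.0, 0.0])    # direction the "France" logit reads from
u_italy = np.array([0.0, 1.0])     # direction the "Italy" logit reads from
E = np.stack([u_france, u_italy])  # unembedding rows for ["France", "Italy"]
tokens = ["France", "Italy"]

h = 1.0 * u_france + 0.5 * u_italy   # hidden state: "France" currently wins
print(tokens[int(np.argmax(E @ h))])  # -> France

# Multiply only the "Italy"-aligned component of h by a scalar alpha.
alpha = 3.0
h_steered = h + (alpha - 1.0) * (h @ u_italy) * u_italy
print(tokens[int(np.argmax(E @ h_steered))])  # -> Italy
```

The point of the sketch is that a single scalar on one component, not a rewrite of the whole hidden state, is enough to change the argmax.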
Author:
Loeschcke, Sebastian, Toftrup, Mads, Kastoryano, Michael J., Belongie, Serge, Snæbjarnarson, Vésteinn
Despite advances using low-rank adapters and quantization, pretraining of large models on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose …
External link:
http://arxiv.org/abs/2405.16528
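The general recipe the abstract hints at -- a frozen quantized base weight combined with small full-precision low-rank factors -- can be sketched as follows (an illustrative setup with made-up shapes and a plain symmetric int8 scheme, not the paper's actual method):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 16, 16, 2

W = rng.normal(size=(d_out, d_in)).astype(np.float32)  # pretrained weight

# Symmetric int8 quantization of the frozen base weight.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)

# Small trainable low-rank factors kept in full precision.
A = rng.normal(scale=0.01, size=(rank, d_in)).astype(np.float32)
B = np.zeros((d_out, rank), dtype=np.float32)  # zero-init: no change at start

def forward(x):
    base = (W_q.astype(np.float32) * scale) @ x  # dequantize on the fly
    return base + B @ (A @ x)                    # low-rank correction

x = rng.normal(size=d_in).astype(np.float32)
y = forward(x)
```

With `B` zero-initialized, the adapter contributes nothing at first, so training starts from the quantized base model; the memory saving comes from storing `W_q` in int8 while only `A` and `B` need gradients.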
Author:
Du, Kevin, Snæbjarnarson, Vésteinn, Stoehr, Niklas, White, Jennifer C., Schein, Aaron, Cotterell, Ryan
To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and …
External link:
http://arxiv.org/abs/2404.04633
Author:
Ingólfsdóttir, Svanhvít Lilja, Ragnarsson, Pétur Orri, Jónsson, Haukur Páll, Símonarson, Haukur Barri, Þorsteinsson, Vilhjálmur, Snæbjarnarson, Vésteinn
Grammatical error correction (GEC) is the task of correcting typos, spelling, punctuation and grammatical issues in text. Approaching the problem as a sequence-to-sequence task, we compare the use of a common subword unit vocabulary and byte-level encoding …
External link:
http://arxiv.org/abs/2305.17906
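The contrast between a subword vocabulary and byte-level encoding can be shown in a few lines (a generic stdlib sketch, not the paper's tokenizer):

```python
# Byte-level encoding maps any text into a fixed 256-symbol vocabulary,
# so rare Icelandic characters never become unknown tokens -- non-ASCII
# characters simply take more than one byte.
word = "Snæbjarnarson"
byte_tokens = list(word.encode("utf-8"))

print(len(word))               # 13 characters
print(len(byte_tokens))        # 14 bytes: 'æ' encodes as two bytes
print(max(byte_tokens) < 256)  # True: vocabulary bounded at 256
```

A learned subword vocabulary would instead need to have seen enough Icelandic to assign such characters their own tokens, which is exactly where the two encodings diverge for lower-resource languages.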
Multilingual language models have pushed the state of the art in cross-lingual NLP transfer. The majority of zero-shot cross-lingual transfer approaches, however, use one and the same massively multilingual transformer (e.g., mBERT or XLM-R) to transfer to all target …
External link:
http://arxiv.org/abs/2304.08823
Author:
Schwartz, Idan, Snæbjarnarson, Vésteinn, Chefer, Hila, Cotterell, Ryan, Belongie, Serge, Wolf, Lior, Benaim, Sagie
Recent advances in text-to-image diffusion models have enabled the generation of diverse and high-quality images. While impressive, the images often fall short of depicting subtle details and are susceptible to errors due to ambiguity in the input text …
External link:
http://arxiv.org/abs/2303.17155
Author:
Christensen, Peter Ebert, Snæbjarnarson, Vésteinn, Dittadi, Andrea, Belongie, Serge, Benaim, Sagie
The robustness of image classifiers is essential to their deployment in the real world. The ability to assess this resilience to manipulations or deviations from the training data is thus crucial. These modifications have traditionally consisted of …
External link:
http://arxiv.org/abs/2211.09782
It can be challenging to build effective open question answering (open QA) systems for languages other than English, mainly due to a lack of labeled data for training. We present a data-efficient method to bootstrap such a system for languages other than English …
External link:
http://arxiv.org/abs/2207.01918
Author:
Snæbjarnarson, Vésteinn, Símonarson, Haukur Barri, Ragnarsson, Pétur Orri, Ingólfsdóttir, Svanhvít Lilja, Jónsson, Haukur Páll, Þorsteinsson, Vilhjálmur, Einarsson, Hafsteinn
We train several language models for Icelandic, including IceBERT, that achieve state-of-the-art performance in a variety of downstream tasks, including part-of-speech tagging, named entity recognition, grammatical error detection and constituency parsing …
External link:
http://arxiv.org/abs/2201.05601