Showing 1 - 9 of 9 for search: '"Patel, Arkil"'
Recent work has developed optimization procedures to find token sequences, called adversarial triggers, which can elicit unsafe responses from aligned language models. These triggers are believed to be universally transferable, i.e., a trigger optimi
External link:
http://arxiv.org/abs/2404.16020
Contemporary Large Language Models (LLMs) exhibit a high degree of code generation and comprehension capability. A particularly promising area is their ability to interpret code modules from unfamiliar libraries for solving user-instructed tasks. Rec
External link:
http://arxiv.org/abs/2311.09635
Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabling them to learn new words and understand community-specific connotations. However, Large Language Models (LLMs) have a knowledge cutoff and are cost
External link:
http://arxiv.org/abs/2310.11634
In order to understand the in-context learning phenomenon, recent works have adopted a stylized experimental framework and demonstrated that Transformers can learn gradient-based learning algorithms for various classes of real-valued functions. Howev
External link:
http://arxiv.org/abs/2310.03016
Despite the widespread success of Transformers on NLP tasks, recent works have found that they struggle to model several formal languages when compared to recurrent models. This raises the question of why Transformers perform well in practice and whe
External link:
http://arxiv.org/abs/2211.12316
Humans can reason compositionally whilst grounding language utterances to the real world. Recent benchmarks like ReaSCAN use navigation tasks grounded in a grid world to assess whether neural models exhibit similar capabilities. In this work, we pres
External link:
http://arxiv.org/abs/2210.12786
Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally generaliz
External link:
http://arxiv.org/abs/2203.07402
The problem of designing NLP solvers for math word problems (MWP) has seen sustained research activity and steady gains in the test accuracy. Since existing solvers achieve high performance on the benchmark datasets for elementary level MWPs containi
External link:
http://arxiv.org/abs/2103.07191
Transformers are being used extensively across several sequence modeling tasks. Significant research effort has been devoted to experimentally probe the inner workings of Transformers. However, our conceptual and theoretical understanding of their po
External link:
http://arxiv.org/abs/2006.09286