Showing 1 - 10 of 21 for search: '"Wies, Noam"'
Language model alignment has become an important component of AI safety, allowing safe interactions between humans and language models, by enhancing desired behaviors and inhibiting undesired ones. It is often done by tuning the model or inserting preset aligning prompts…
External link:
http://arxiv.org/abs/2401.16332
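As an aside on the prompt-insertion flavor of alignment mentioned above, a minimal sketch in Python (the aligning text and function name are illustrative, not from the paper):

    # Prompt-based alignment: prepend a preset aligning prompt to every user query.
    ALIGNING_PROMPT = "You are a helpful assistant. Politely refuse harmful requests."

    def build_input(user_query: str) -> str:
        # The frozen model conditions on the aligning prompt before the user's text;
        # no weights are changed, unlike alignment via fine-tuning.
        return ALIGNING_PROMPT + "\n\n" + user_query

    print(build_input("How do I bake bread?"))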
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Authors:
Segev, Eliya; Alroy, Maya; Katsir, Ronen; Wies, Noam; Shenhav, Ayana; Ben-Oren, Yael; Zar, David; Tadmor, Oren; Bitterman, Jacob; Shashua, Amnon; Rosenwein, Tal
Connectionist Temporal Classification (CTC) is a widely used criterion for training supervised sequence-to-sequence (seq2seq) models. It enables learning the relations between input and output sequences, termed alignments, by marginalizing over perfect alignments…
External link:
http://arxiv.org/abs/2307.01715
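For readers unfamiliar with CTC, a minimal runnable sketch using PyTorch's built-in loss (shapes and values are illustrative; this is PyTorch's standard CTC criterion, not the paper's plug-and-play framework):

    import torch
    import torch.nn as nn

    T, N, C, S = 50, 4, 20, 10  # time steps, batch size, classes (incl. blank), target length
    log_probs = torch.randn(T, N, C).log_softmax(dim=2)  # stand-in for seq2seq model emissions
    targets = torch.randint(1, C, (N, S))                # label indices; 0 is reserved for blank
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), S, dtype=torch.long)

    # The loss marginalizes over all frame-level alignments that collapse to the target.
    ctc = nn.CTCLoss(blank=0)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    print(loss.item())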
An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones…
External link:
http://arxiv.org/abs/2304.11082
In-context learning is a surprising and important phenomenon that emerged when modern language models were scaled to billions of learned parameters. Without modifying a large language model's weights, it can be tuned to perform various downstream natural language tasks…
External link:
http://arxiv.org/abs/2303.07895
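A minimal sketch of what tuning without weight updates looks like in practice: demonstrations concatenated into the prompt ahead of the query (the sentiment task and examples are illustrative, not from the paper):

    # In-context learning: training examples live in the prompt; no gradient step occurs.
    demonstrations = [
        ("great movie, loved it", "positive"),
        ("boring and far too long", "negative"),
    ]
    query = "an instant classic"

    prompt = "".join(f"Review: {x}\nLabel: {y}\n\n" for x, y in demonstrations)
    prompt += f"Review: {query}\nLabel:"
    print(prompt)  # fed as-is to a frozen language model, which completes the label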
The field of Natural Language Processing has experienced a dramatic leap in capabilities with the recent introduction of huge Language Models. Despite this success, natural language problems that involve several compounded steps are still practically unlearnable, even by the largest LMs…
External link:
http://arxiv.org/abs/2204.02892
Pretraining Neural Language Models (NLMs) over a large corpus involves chunking the text into training examples, which are contiguous text segments of sizes processable by the neural architecture. We highlight a bias introduced by this common practice…
External link:
http://arxiv.org/abs/2110.04541
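A minimal sketch of the chunking practice the paper examines: a corpus-level token stream cut into contiguous, architecture-sized training examples (the sequence length and helper name are illustrative):

    # Chunk a token stream into contiguous training examples of fixed length.
    def chunk(token_ids, seq_len):
        # Drops the trailing remainder, as many pretraining pipelines do.
        return [token_ids[i:i + seq_len]
                for i in range(0, len(token_ids) - seq_len + 1, seq_len)]

    stream = list(range(10))   # stand-in for a tokenized corpus
    print(chunk(stream, 4))    # [[0, 1, 2, 3], [4, 5, 6, 7]]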
After their successful debut in natural language processing, Transformer architectures are now becoming the de-facto standard in many domains. An obstacle for their deployment over new modalities is the architectural configuration: the optimal depth-to-width ratio…
External link:
http://arxiv.org/abs/2105.03928
Self-attention architectures, which are rapidly pushing the frontier in natural language processing, demonstrate a surprising depth-inefficient behavior: previous works indicate that increasing the internal representation (network width) is just as useful as increasing the number of self-attention layers (network depth)…
External link:
http://arxiv.org/abs/2006.12467
Published in:
Phys. Rev. Lett. 124, 020503 (2020)
Artificial Neural Networks were recently shown to be an efficient representation of highly-entangled many-body quantum states. In practical applications, neural-network states inherit numerical schemes used in Variational Monte Carlo, most notably the use of Markov-Chain Monte-Carlo (MCMC) sampling to estimate quantum expectations…
External link:
http://arxiv.org/abs/1902.04057
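A minimal sketch of the MCMC expectation estimate the abstract refers to, using a toy unnormalized distribution standing in for |psi(s)|^2 (Metropolis single-spin flips; everything here is illustrative, not the paper's method):

    import numpy as np

    rng = np.random.default_rng(0)

    def prob(s):
        # Toy unnormalized |psi(s)|^2 over spin configurations s in {-1, +1}^n.
        return np.exp(0.5 * s.sum())

    s = rng.choice([-1, 1], size=8)
    samples = []
    for _ in range(5000):
        i = rng.integers(len(s))      # propose flipping one spin
        s_new = s.copy()
        s_new[i] *= -1
        if rng.random() < prob(s_new) / prob(s):  # Metropolis acceptance min(1, p'/p)
            s = s_new
        samples.append(s.copy())

    # Estimate the expectation of O(s) = mean spin as an average over the chain.
    print(np.mean([x.mean() for x in samples]))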
Published in:
In Tensors for Data Processing 2022:215-248