Showing 1 - 10 of 21 for search: '"Wies, Noam"'
Language model alignment has become an important component of AI safety, allowing safe interactions between humans and language models, by enhancing desired behaviors and inhibiting undesired ones. It is often done by tuning the model or inserting preset aligning prompts…
External link:
http://arxiv.org/abs/2401.16332
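As an aside on the prompt-insertion flavor of alignment mentioned above, a minimal sketch in Python (the aligning text and function name are illustrative, not from the paper):

    # Prompt-based alignment: prepend a preset aligning prompt to every user query.
    ALIGNING_PROMPT = "You are a helpful assistant. Politely refuse harmful requests."

    def build_input(user_query: str) -> str:
        # The frozen model conditions on the aligning prompt before the user's text;
        # no weights are changed, unlike alignment via fine-tuning.
        return ALIGNING_PROMPT + "\n\n" + user_query

    print(build_input("How do I bake bread?"))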
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Authors:
Segev, Eliya; Alroy, Maya; Katsir, Ronen; Wies, Noam; Shenhav, Ayana; Ben-Oren, Yael; Zar, David; Tadmor, Oren; Bitterman, Jacob; Shashua, Amnon; Rosenwein, Tal
Connectionist Temporal Classification (CTC) is a widely used criterion for training supervised sequence-to-sequence (seq2seq) models. It enables learning the relations between input and output sequences, termed alignments, by marginalizing over perfect alignments…
External link:
http://arxiv.org/abs/2307.01715
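For readers unfamiliar with CTC, a minimal runnable sketch using PyTorch's built-in loss (shapes and values are illustrative; this is PyTorch's standard CTC criterion, not the paper's plug-and-play framework):

    import torch
    import torch.nn as nn

    T, N, C, S = 50, 4, 20, 10  # time steps, batch size, classes (incl. blank), target length
    log_probs = torch.randn(T, N, C).log_softmax(dim=2)  # stand-in for seq2seq model emissions
    targets = torch.randint(1, C, (N, S))                # label indices; 0 is reserved for blank
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), S, dtype=torch.long)

    # The loss marginalizes over all frame-level alignments that collapse to the target.
    ctc = nn.CTCLoss(blank=0)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    print(loss.item())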
An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones…
External link:
http://arxiv.org/abs/2304.11082
In-context learning is a surprising and important phenomenon that emerged when modern language models were scaled to billions of learned parameters. Without modifying a large language model's weights, it can be tuned to perform various downstream natural language tasks…
External link:
http://arxiv.org/abs/2303.07895
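A minimal sketch of what tuning without weight updates looks like in practice: demonstrations concatenated into the prompt ahead of the query (the sentiment task and examples are illustrative, not from the paper):

    # In-context learning: training examples live in the prompt; no gradient step occurs.
    demonstrations = [
        ("great movie, loved it", "positive"),
        ("boring and far too long", "negative"),
    ]
    query = "an instant classic"

    prompt = "".join(f"Review: {x}\nLabel: {y}\n\n" for x, y in demonstrations)
    prompt += f"Review: {query}\nLabel:"
    print(prompt)  # fed as-is to a frozen language model, which completes the label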
The field of Natural Language Processing has experienced a dramatic leap in capabilities with the recent introduction of huge Language Models. Despite this success, natural language problems that involve several compounded steps are still practically unlearnable, even by the largest LMs…
External link:
http://arxiv.org/abs/2204.02892
Pretraining Neural Language Models (NLMs) over a large corpus involves chunking the text into training examples, which are contiguous text segments of sizes processable by the neural architecture. We highlight a bias introduced by this common practice…
External link:
http://arxiv.org/abs/2110.04541
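A minimal sketch of the chunking practice the paper examines: a corpus-level token stream cut into contiguous, architecture-sized training examples (the sequence length and helper name are illustrative):

    # Chunk a token stream into contiguous training examples of fixed length.
    def chunk(token_ids, seq_len):
        # Drops the trailing remainder, as many pretraining pipelines do.
        return [token_ids[i:i + seq_len]
                for i in range(0, len(token_ids) - seq_len + 1, seq_len)]

    stream = list(range(10))   # stand-in for a tokenized corpus
    print(chunk(stream, 4))    # [[0, 1, 2, 3], [4, 5, 6, 7]]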
After their successful debut in natural language processing, Transformer architectures are now becoming the de-facto standard in many domains. An obstacle for their deployment over new modalities is the architectural configuration: the optimal depth-to-width ratio…
External link:
http://arxiv.org/abs/2105.03928
Self-attention architectures, which are rapidly pushing the frontier in natural language processing, demonstrate a surprising depth-inefficient behavior: previous works indicate that increasing the internal representation (network width) is just as useful as increasing the number of self-attention layers (network depth)…
External link:
http://arxiv.org/abs/2006.12467
Published in:
Phys. Rev. Lett. 124, 020503 (2020)
Artificial Neural Networks were recently shown to be an efficient representation of highly-entangled many-body quantum states. In practical applications, neural-network states inherit numerical schemes used in Variational Monte Carlo, most notably the use of Markov-Chain Monte-Carlo (MCMC) sampling to estimate quantum expectations…
External link:
http://arxiv.org/abs/1902.04057
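A minimal sketch of the MCMC expectation estimate the abstract refers to, using a toy unnormalized distribution standing in for |psi(s)|^2 (Metropolis single-spin flips; everything here is illustrative, not the paper's method):

    import numpy as np

    rng = np.random.default_rng(0)

    def prob(s):
        # Toy unnormalized |psi(s)|^2 over spin configurations s in {-1, +1}^n.
        return np.exp(0.5 * s.sum())

    s = rng.choice([-1, 1], size=8)
    samples = []
    for _ in range(5000):
        i = rng.integers(len(s))      # propose flipping one spin
        s_new = s.copy()
        s_new[i] *= -1
        if rng.random() < prob(s_new) / prob(s):  # Metropolis acceptance min(1, p'/p)
            s = s_new
        samples.append(s.copy())

    # Estimate the expectation of O(s) = mean spin as an average over the chain.
    print(np.mean([x.mean() for x in samples]))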
Published in:
In Tensors for Data Processing 2022:215-248