Showing 1 - 10 of 20 for search: "Zemlyanskiy, Yury"
Author:
Zemlyanskiy, Yury, de Jong, Michiel, Vilnis, Luke, Ontañón, Santiago, Cohen, William W., Sanghai, Sumit, Ainslie, Joshua
Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world. Memory-based methods like LUMEN pre-compute token representations for retrieved passages to drastically speed up inference. However, …
External link:
http://arxiv.org/abs/2308.14903
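The entry above (the MEMORY-VQ paper) concerns shrinking pre-computed memories. As a point of reference only, here is a minimal sketch of the underlying idea, vector quantization: replace each memory vector with the index of its nearest codeword, so storage drops from d floats to one byte per vector. This uses plain k-means on random stand-in data; the paper's actual method is more sophisticated, and all names and sizes below are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def nearest_code(vectors, codebook):
    # Squared euclidean distance via ||v||^2 - 2 v.c + ||c||^2 (memory-friendly).
    d = ((vectors ** 2).sum(1, keepdims=True)
         - 2.0 * vectors @ codebook.T
         + (codebook ** 2).sum(1))
    return d.argmin(1)

def train_codebook(vectors, num_codes=256, iters=10):
    """Plain k-means over the memory vectors."""
    codebook = vectors[rng.choice(len(vectors), num_codes, replace=False)].copy()
    for _ in range(iters):
        assign = nearest_code(vectors, codebook)
        for k in range(num_codes):
            members = vectors[assign == k]
            if len(members):
                codebook[k] = members.mean(0)
    return codebook

# Pretend these are token representations pre-computed for retrieved passages.
memory = rng.normal(size=(10_000, 64)).astype(np.float32)
codebook = train_codebook(memory)
codes = nearest_code(memory, codebook).astype(np.uint8)    # 1 byte per vector
approx = codebook[codes]                                   # reconstructed at inference time
print("compression ratio:", memory.nbytes / codes.nbytes)  # 256x here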
Author:
de Jong, Michiel, Zemlyanskiy, Yury, FitzGerald, Nicholas, Sanghai, Sumit, Cohen, William W., Ainslie, Joshua
Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text. Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes …
External link:
http://arxiv.org/abs/2306.10231
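GLIMMER is a late-interaction memory reranker, so the relevant primitive is a ColBERT-style MaxSim score between query tokens and pre-computed passage token memories. A minimal sketch with random stand-in embeddings; this is the generic late-interaction scoring pattern, not GLIMMER's actual architecture.

import numpy as np

def late_interaction_score(query_vecs, passage_vecs):
    """ColBERT-style MaxSim: each query token matches its best passage token."""
    sims = query_vecs @ passage_vecs.T    # [q_tokens, p_tokens]
    return sims.max(axis=1).sum()         # sum of per-query-token maxima

rng = np.random.default_rng(1)
query = rng.normal(size=(8, 32))                           # token-level query embeddings
memory = [rng.normal(size=(40, 32)) for _ in range(100)]   # pre-computed token memories

scores = np.array([late_interaction_score(query, p) for p in memory])
top = np.argsort(scores)[::-1][:5]    # rerank: keep the 5 best memories
print("selected memory ids:", top)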
Author:
Ainslie, Joshua, Lee-Thorp, James, de Jong, Michiel, Zemlyanskiy, Yury, Lebrón, Federico, Sanghai, Sumit
Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference. We …
External link:
http://arxiv.org/abs/2305.13245
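This is the paper that introduces grouped-query attention (GQA): each group of query heads shares one key-value head, interpolating between multi-head attention (one KV head per query head) and multi-query attention (one KV head total). A minimal numpy sketch, with illustrative shapes and random data:

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v, num_kv_heads):
    """q: [h_q, T, d]; k, v: [num_kv_heads, T, d]. Each group of
    h_q / num_kv_heads query heads shares one key-value head."""
    h_q, T, d = q.shape
    group = h_q // num_kv_heads
    out = np.empty_like(q)
    for h in range(h_q):
        kv = h // group    # which shared KV head this query head uses
        att = softmax(q[h] @ k[kv].T / np.sqrt(d))
        out[h] = att @ v[kv]
    return out

rng = np.random.default_rng(2)
h_q, T, d = 8, 16, 32
q = rng.normal(size=(h_q, T, d))
k = rng.normal(size=(2, T, d))    # 2 KV heads: between MQA (1) and MHA (8)
v = rng.normal(size=(2, T, d))
print(grouped_query_attention(q, k, v, num_kv_heads=2).shape)

With 8 query heads, num_kv_heads=1 recovers MQA and num_kv_heads=8 recovers standard multi-head attention; the KV cache shrinks in proportion to num_kv_heads.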
Author:
Ainslie, Joshua, Lei, Tao, de Jong, Michiel, Ontañón, Santiago, Brahma, Siddhartha, Zemlyanskiy, Yury, Uthus, David, Guo, Mandy, Lee-Thorp, James, Tay, Yi, Sung, Yun-Hsuan, Sanghai, Sumit
Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. …
External link:
http://arxiv.org/abs/2303.09752
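This entry (the CoLT5 paper) motivates spending compute only on the tokens that matter. A toy sketch of that kind of conditional computation: every token goes through a cheap feedforward path, and only the k highest-scoring tokens also get an expensive one. The router and weights here are random stand-ins, not the paper's model.

import numpy as np

def conditional_ffn(tokens, router_w, light_w, heavy_w, k):
    """Route the k highest-scoring tokens through a heavy FFN;
    every token gets the light FFN."""
    scores = tokens @ router_w                     # [T] importance scores
    out = np.tanh(tokens @ light_w)                # cheap path for all tokens
    top = np.argsort(scores)[-k:]                  # indices of "important" tokens
    out[top] += np.tanh(tokens[top] @ heavy_w)     # extra capacity for the few
    return out

rng = np.random.default_rng(3)
T, d = 128, 64
tokens = rng.normal(size=(T, d))
out = conditional_ffn(tokens,
                      router_w=rng.normal(size=(d,)),
                      light_w=rng.normal(size=(d, d)),
                      heavy_w=rng.normal(size=(d, d)),
                      k=16)
print(out.shape)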
Author:
de Jong, Michiel, Zemlyanskiy, Yury, FitzGerald, Nicholas, Ainslie, Joshua, Sanghai, Sumit, Sha, Fei, Cohen, William
Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks. However, they are also expensive, due to the need to encode a large number of retrieved passages. Some …
External link:
http://arxiv.org/abs/2301.10448
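The trade-off set up here is between re-encoding retrieved passages for every query (expensive, high quality) and pre-encoding the corpus once into a memory (cheap, lower quality). A toy sketch of the hybrid middle ground, pre-compute with a large encoder offline and run only a small live step online; both encoders below are random stand-ins and the shapes are illustrative.

import numpy as np

rng = np.random.default_rng(4)
D = 64

def big_encoder(token_ids):
    # Expensive encoder, run once per passage offline (random stand-in).
    return rng.normal(size=(len(token_ids), D))

def live_encoder(question_vecs, memory_vecs):
    # Small on-the-fly component: one attention step of the question
    # over the pre-computed passage memory.
    att = question_vecs @ memory_vecs.T
    att = np.exp(att - att.max(1, keepdims=True))
    att /= att.sum(1, keepdims=True)
    return att @ memory_vecs

# Offline: pre-encode the corpus into memory (paid once, not per query).
corpus = [list(range(30)) for _ in range(100)]
memory = [big_encoder(p) for p in corpus]

# Online: retrieve a few memories and run only the cheap live step.
question = rng.normal(size=(8, D))
fused = [live_encoder(question, m) for m in memory[:4]]
print(len(fused), fused[0].shape)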
Author:
de Jong, Michiel, Zemlyanskiy, Yury, Ainslie, Joshua, FitzGerald, Nicholas, Sanghai, Sumit, Sha, Fei, Cohen, William
Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state-of-the-art on many knowledge-intensive NLP tasks. However, the architecture used for FiD was chosen by making minimal modifications to a standard T5 model, …
External link:
http://arxiv.org/abs/2212.08153
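For readers unfamiliar with the Fusion-in-Decoder pattern referenced here: each retrieved passage is encoded independently, the encoder outputs are concatenated, and the decoder cross-attends over all of them at once. A minimal sketch with stand-in encoder and attention; the architectural changes this paper actually proposes are not shown.

import numpy as np

rng = np.random.default_rng(5)
D = 64

def encoder(passage_vecs):
    return np.tanh(passage_vecs)    # stand-in for a T5 encoder

def cross_attention(dec_state, enc_states):
    att = dec_state @ enc_states.T
    att = np.exp(att - att.max(-1, keepdims=True))
    att /= att.sum(-1, keepdims=True)
    return att @ enc_states

# FiD pattern: encode each passage separately, then let the decoder
# attend over the concatenation of all encoder outputs.
passages = [rng.normal(size=(50, D)) for _ in range(20)]
encoded = np.concatenate([encoder(p) for p in passages], axis=0)    # [20*50, D]
dec_state = rng.normal(size=(1, D))
context = cross_attention(dec_state, encoded)
print(context.shape)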
Decoding methods for large language models often trade off between diversity of outputs and parallelism of computation. Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not …
External link:
http://arxiv.org/abs/2210.15458
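One way to make diverse decoding embarrassingly parallel is to assign each beam element a codepoint in [0, 1) and decode it independently through the model's conditional CDFs, as in arithmetic coding; evenly spaced codepoints then tend to yield distinct outputs. A toy sketch with a fake next-token distribution, illustrating the general idea rather than this paper's exact algorithm:

import numpy as np

def decode_codepoint(u, next_dist, max_len):
    """Turn a point u in [0, 1) into a token sequence by walking the
    conditional CDFs. Distinct codepoints usually give distinct
    sequences, and each decodes independently of the others."""
    seq = []
    for _ in range(max_len):
        probs = next_dist(seq)
        cum = np.cumsum(probs)
        tok = int(np.searchsorted(cum, u, side="right"))
        tok = min(tok, len(probs) - 1)    # guard against rounding at u ~ 1.0
        lo = cum[tok - 1] if tok > 0 else 0.0
        u = (u - lo) / probs[tok]         # rescale u inside the chosen interval
        seq.append(tok)
    return seq

def fake_next_dist(seq, vocab=5):
    rng = np.random.default_rng(len(seq))    # toy stand-in for a language model
    p = rng.random(vocab)
    return p / p.sum()

# Evenly spaced codepoints give a diverse "beam", one sequence per worker.
for u in np.linspace(0.05, 0.95, 4):
    print(decode_codepoint(u, fake_next_dist, max_len=6))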
Author:
Zemlyanskiy, Yury, de Jong, Michiel, Ainslie, Joshua, Pasupat, Panupong, Shaw, Peter, Qiu, Linlu, Sanghai, Sumit, Sha, Fei
A common recent approach to semantic parsing augments sequence-to-sequence models by retrieving and appending a set of training samples, called exemplars. The effectiveness of this recipe is limited by the ability to retrieve informative exemplars …
External link:
http://arxiv.org/abs/2209.14899
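The retrieve-and-append recipe described here is short in code: find the training exemplars most similar to the input and concatenate them onto it before parsing. A toy bag-of-words version with made-up exemplars; real systems use TF-IDF or learned retrievers over the full training set.

from collections import Counter
import math

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a if t in b)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Toy training set of (utterance, parse) exemplars.
train = [
    ("list flights from boston to denver", "(flights (from boston) (to denver))"),
    ("show me hotels in paris", "(hotels (in paris))"),
    ("flights from denver to boston on monday", "(flights (from denver) (to boston) (day monday))"),
]

def retrieve_and_append(query, k=2):
    ranked = sorted(train, key=lambda ex: cosine(bow(query), bow(ex[0])), reverse=True)
    # Input to the seq2seq parser: the query plus the retrieved exemplars.
    return query + " | " + " | ".join(f"{u} => {p}" for u, p in ranked[:k])

print(retrieve_and_append("flights from boston to denver on friday"))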
Natural language understanding tasks such as open-domain question answering often require retrieving and assimilating factual information from multiple sources. We propose to address this problem by integrating a semi-parametric representation of a large text corpus …
External link:
http://arxiv.org/abs/2110.06176
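A semi-parametric memory of this kind can be pictured as a large table of dense vectors that the model attends into mid-computation. A minimal sketch, top-k retrieval followed by attention over the retrieved vectors; the memory is random stand-in data, and this is the generic pattern rather than the paper's exact model.

import numpy as np

rng = np.random.default_rng(6)
D, N = 64, 100_000    # real mention memories are far larger

# Semi-parametric knowledge source: one dense vector per entity mention,
# pre-computed over the corpus (random stand-ins here).
mention_memory = rng.normal(size=(N, D)).astype(np.float32)

def attend_to_memory(query_vec, k=32):
    """Retrieve the top-k nearest memory vectors, then take an
    attention-weighted average; the result can be added back into
    a Transformer layer's hidden state."""
    scores = mention_memory @ query_vec
    top = np.argpartition(scores, -k)[-k:]
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()
    return w @ mention_memory[top]

hidden = rng.normal(size=(D,)).astype(np.float32)
knowledge = attend_to_memory(hidden)
print(knowledge.shape)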
Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs such as books or article collections. We propose ReadTwice, a simple and effective technique that combines several …
External link:
http://arxiv.org/abs/2105.04241
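As its name suggests, ReadTwice processes the input in two passes. A common way to realize that, sketched below with a stand-in encoder: the first pass pools each segment into a compact memory vector, and the second pass re-encodes each segment with attention over all segments' memories, so information flows across the whole document. Illustrative only, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(7)
D = 64

def encode(segment, extra=None):
    h = np.tanh(segment)    # stand-in for a Transformer pass
    if extra is not None:
        att = h @ extra.T
        att = np.exp(att - att.max(1, keepdims=True))
        att /= att.sum(1, keepdims=True)
        h = h + att @ extra    # attend to memories of other segments
    return h

# A long document split into segments that each fit in the encoder.
segments = [rng.normal(size=(128, D)) for _ in range(16)]

# First read: encode each segment alone and pool it into one memory vector.
memories = np.stack([encode(s).mean(axis=0) for s in segments])    # [16, D]

# Second read: re-encode each segment with access to all segments' memories.
second = [encode(s, extra=memories) for s in segments]
print(second[0].shape)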