Showing 1 - 10 of 20 for search: "Zemlyanskiy, Yury"
Author:
Zemlyanskiy, Yury, de Jong, Michiel, Vilnis, Luke, Ontañón, Santiago, Cohen, William W., Sanghai, Sumit, Ainslie, Joshua
Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world. Memory-based methods like LUMEN pre-compute token representations for retrieved passages to drastically speed up inference. However, …
External link:
http://arxiv.org/abs/2308.14903
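The entry above (the MEMORY-VQ paper) concerns shrinking pre-computed memories. As a point of reference only, here is a minimal sketch of the underlying idea, vector quantization: replace each memory vector with the index of its nearest codeword, so storage drops from d floats to one byte per vector. This uses plain k-means on random stand-in data; the paper's actual method is more sophisticated, and all names and sizes below are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def nearest_code(vectors, codebook):
    # Squared euclidean distance via ||v||^2 - 2 v.c + ||c||^2 (memory-friendly).
    d = ((vectors ** 2).sum(1, keepdims=True)
         - 2.0 * vectors @ codebook.T
         + (codebook ** 2).sum(1))
    return d.argmin(1)

def train_codebook(vectors, num_codes=256, iters=10):
    """Plain k-means over the memory vectors."""
    codebook = vectors[rng.choice(len(vectors), num_codes, replace=False)].copy()
    for _ in range(iters):
        assign = nearest_code(vectors, codebook)
        for k in range(num_codes):
            members = vectors[assign == k]
            if len(members):
                codebook[k] = members.mean(0)
    return codebook

# Pretend these are token representations pre-computed for retrieved passages.
memory = rng.normal(size=(10_000, 64)).astype(np.float32)
codebook = train_codebook(memory)
codes = nearest_code(memory, codebook).astype(np.uint8)    # 1 byte per vector
approx = codebook[codes]                                   # reconstructed at inference time
print("compression ratio:", memory.nbytes / codes.nbytes)  # 256x here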
Author:
de Jong, Michiel, Zemlyanskiy, Yury, FitzGerald, Nicholas, Sanghai, Sumit, Cohen, William W., Ainslie, Joshua
Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text. Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes …
External link:
http://arxiv.org/abs/2306.10231
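GLIMMER is a late-interaction memory reranker, so the relevant primitive is a ColBERT-style MaxSim score between query tokens and pre-computed passage token memories. A minimal sketch with random stand-in embeddings; this is the generic late-interaction scoring pattern, not GLIMMER's actual architecture.

import numpy as np

def late_interaction_score(query_vecs, passage_vecs):
    """ColBERT-style MaxSim: each query token matches its best passage token."""
    sims = query_vecs @ passage_vecs.T    # [q_tokens, p_tokens]
    return sims.max(axis=1).sum()         # sum of per-query-token maxima

rng = np.random.default_rng(1)
query = rng.normal(size=(8, 32))                           # token-level query embeddings
memory = [rng.normal(size=(40, 32)) for _ in range(100)]   # pre-computed token memories

scores = np.array([late_interaction_score(query, p) for p in memory])
top = np.argsort(scores)[::-1][:5]    # rerank: keep the 5 best memories
print("selected memory ids:", top)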
Author:
Ainslie, Joshua, Lee-Thorp, James, de Jong, Michiel, Zemlyanskiy, Yury, Lebrón, Federico, Sanghai, Sumit
Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference. We …
External link:
http://arxiv.org/abs/2305.13245
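This is the paper that introduces grouped-query attention (GQA): each group of query heads shares one key-value head, interpolating between multi-head attention (one KV head per query head) and multi-query attention (one KV head total). A minimal numpy sketch, with illustrative shapes and random data:

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v, num_kv_heads):
    """q: [h_q, T, d]; k, v: [num_kv_heads, T, d]. Each group of
    h_q / num_kv_heads query heads shares one key-value head."""
    h_q, T, d = q.shape
    group = h_q // num_kv_heads
    out = np.empty_like(q)
    for h in range(h_q):
        kv = h // group    # which shared KV head this query head uses
        att = softmax(q[h] @ k[kv].T / np.sqrt(d))
        out[h] = att @ v[kv]
    return out

rng = np.random.default_rng(2)
h_q, T, d = 8, 16, 32
q = rng.normal(size=(h_q, T, d))
k = rng.normal(size=(2, T, d))    # 2 KV heads: between MQA (1) and MHA (8)
v = rng.normal(size=(2, T, d))
print(grouped_query_attention(q, k, v, num_kv_heads=2).shape)

With 8 query heads, num_kv_heads=1 recovers MQA and num_kv_heads=8 recovers standard multi-head attention; the KV cache shrinks in proportion to num_kv_heads.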
Author:
Ainslie, Joshua, Lei, Tao, de Jong, Michiel, Ontañón, Santiago, Brahma, Siddhartha, Zemlyanskiy, Yury, Uthus, David, Guo, Mandy, Lee-Thorp, James, Tay, Yi, Sung, Yun-Hsuan, Sanghai, Sumit
Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. …
External link:
http://arxiv.org/abs/2303.09752
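This entry (the CoLT5 paper) motivates spending compute only on the tokens that matter. A toy sketch of that kind of conditional computation: every token goes through a cheap feedforward path, and only the k highest-scoring tokens also get an expensive one. The router and weights here are random stand-ins, not the paper's model.

import numpy as np

def conditional_ffn(tokens, router_w, light_w, heavy_w, k):
    """Route the k highest-scoring tokens through a heavy FFN;
    every token gets the light FFN."""
    scores = tokens @ router_w                     # [T] importance scores
    out = np.tanh(tokens @ light_w)                # cheap path for all tokens
    top = np.argsort(scores)[-k:]                  # indices of "important" tokens
    out[top] += np.tanh(tokens[top] @ heavy_w)     # extra capacity for the few
    return out

rng = np.random.default_rng(3)
T, d = 128, 64
tokens = rng.normal(size=(T, d))
out = conditional_ffn(tokens,
                      router_w=rng.normal(size=(d,)),
                      light_w=rng.normal(size=(d, d)),
                      heavy_w=rng.normal(size=(d, d)),
                      k=16)
print(out.shape)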
Author:
de Jong, Michiel, Zemlyanskiy, Yury, FitzGerald, Nicholas, Ainslie, Joshua, Sanghai, Sumit, Sha, Fei, Cohen, William
Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks. However, they are also expensive, due to the need to encode a large number of retrieved passages. Some …
External link:
http://arxiv.org/abs/2301.10448
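The trade-off set up here is between re-encoding retrieved passages for every query (expensive, high quality) and pre-encoding the corpus once into a memory (cheap, lower quality). A toy sketch of the hybrid middle ground, pre-compute with a large encoder offline and run only a small live step online; both encoders below are random stand-ins and the shapes are illustrative.

import numpy as np

rng = np.random.default_rng(4)
D = 64

def big_encoder(token_ids):
    # Expensive encoder, run once per passage offline (random stand-in).
    return rng.normal(size=(len(token_ids), D))

def live_encoder(question_vecs, memory_vecs):
    # Small on-the-fly component: one attention step of the question
    # over the pre-computed passage memory.
    att = question_vecs @ memory_vecs.T
    att = np.exp(att - att.max(1, keepdims=True))
    att /= att.sum(1, keepdims=True)
    return att @ memory_vecs

# Offline: pre-encode the corpus into memory (paid once, not per query).
corpus = [list(range(30)) for _ in range(100)]
memory = [big_encoder(p) for p in corpus]

# Online: retrieve a few memories and run only the cheap live step.
question = rng.normal(size=(8, D))
fused = [live_encoder(question, m) for m in memory[:4]]
print(len(fused), fused[0].shape)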
Author:
de Jong, Michiel, Zemlyanskiy, Yury, Ainslie, Joshua, FitzGerald, Nicholas, Sanghai, Sumit, Sha, Fei, Cohen, William
Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state-of-the-art on many knowledge-intensive NLP tasks. However, the architecture used for FiD was chosen by making minimal modifications to a standard T5 model, …
External link:
http://arxiv.org/abs/2212.08153
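For readers unfamiliar with the Fusion-in-Decoder pattern referenced here: each retrieved passage is encoded independently, the encoder outputs are concatenated, and the decoder cross-attends over all of them at once. A minimal sketch with stand-in encoder and attention; the architectural changes this paper actually proposes are not shown.

import numpy as np

rng = np.random.default_rng(5)
D = 64

def encoder(passage_vecs):
    return np.tanh(passage_vecs)    # stand-in for a T5 encoder

def cross_attention(dec_state, enc_states):
    att = dec_state @ enc_states.T
    att = np.exp(att - att.max(-1, keepdims=True))
    att /= att.sum(-1, keepdims=True)
    return att @ enc_states

# FiD pattern: encode each passage separately, then let the decoder
# attend over the concatenation of all encoder outputs.
passages = [rng.normal(size=(50, D)) for _ in range(20)]
encoded = np.concatenate([encoder(p) for p in passages], axis=0)    # [20*50, D]
dec_state = rng.normal(size=(1, D))
context = cross_attention(dec_state, encoded)
print(context.shape)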
Decoding methods for large language models often trade off between diversity of outputs and parallelism of computation. Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not …
External link:
http://arxiv.org/abs/2210.15458
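One way to make diverse decoding embarrassingly parallel is to assign each beam element a codepoint in [0, 1) and decode it independently through the model's conditional CDFs, as in arithmetic coding; evenly spaced codepoints then tend to yield distinct outputs. A toy sketch with a fake next-token distribution, illustrating the general idea rather than this paper's exact algorithm:

import numpy as np

def decode_codepoint(u, next_dist, max_len):
    """Turn a point u in [0, 1) into a token sequence by walking the
    conditional CDFs. Distinct codepoints usually give distinct
    sequences, and each decodes independently of the others."""
    seq = []
    for _ in range(max_len):
        probs = next_dist(seq)
        cum = np.cumsum(probs)
        tok = int(np.searchsorted(cum, u, side="right"))
        tok = min(tok, len(probs) - 1)    # guard against rounding at u ~ 1.0
        lo = cum[tok - 1] if tok > 0 else 0.0
        u = (u - lo) / probs[tok]         # rescale u inside the chosen interval
        seq.append(tok)
    return seq

def fake_next_dist(seq, vocab=5):
    rng = np.random.default_rng(len(seq))    # toy stand-in for a language model
    p = rng.random(vocab)
    return p / p.sum()

# Evenly spaced codepoints give a diverse "beam", one sequence per worker.
for u in np.linspace(0.05, 0.95, 4):
    print(decode_codepoint(u, fake_next_dist, max_len=6))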
Author:
Zemlyanskiy, Yury, de Jong, Michiel, Ainslie, Joshua, Pasupat, Panupong, Shaw, Peter, Qiu, Linlu, Sanghai, Sumit, Sha, Fei
A common recent approach to semantic parsing augments sequence-to-sequence models by retrieving and appending a set of training samples, called exemplars. The effectiveness of this recipe is limited by the ability to retrieve informative exemplars …
External link:
http://arxiv.org/abs/2209.14899
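The retrieve-and-append recipe described here is short in code: find the training exemplars most similar to the input and concatenate them onto it before parsing. A toy bag-of-words version with made-up exemplars; real systems use TF-IDF or learned retrievers over the full training set.

from collections import Counter
import math

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a if t in b)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Toy training set of (utterance, parse) exemplars.
train = [
    ("list flights from boston to denver", "(flights (from boston) (to denver))"),
    ("show me hotels in paris", "(hotels (in paris))"),
    ("flights from denver to boston on monday", "(flights (from denver) (to boston) (day monday))"),
]

def retrieve_and_append(query, k=2):
    ranked = sorted(train, key=lambda ex: cosine(bow(query), bow(ex[0])), reverse=True)
    # Input to the seq2seq parser: the query plus the retrieved exemplars.
    return query + " | " + " | ".join(f"{u} => {p}" for u, p in ranked[:k])

print(retrieve_and_append("flights from boston to denver on friday"))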
Natural language understanding tasks such as open-domain question answering often require retrieving and assimilating factual information from multiple sources. We propose to address this problem by integrating a semi-parametric representation of a large text corpus …
External link:
http://arxiv.org/abs/2110.06176
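A semi-parametric memory of this kind can be pictured as a large table of dense vectors that the model attends into mid-computation. A minimal sketch, top-k retrieval followed by attention over the retrieved vectors; the memory is random stand-in data, and this is the generic pattern rather than the paper's exact model.

import numpy as np

rng = np.random.default_rng(6)
D, N = 64, 100_000    # real mention memories are far larger

# Semi-parametric knowledge source: one dense vector per entity mention,
# pre-computed over the corpus (random stand-ins here).
mention_memory = rng.normal(size=(N, D)).astype(np.float32)

def attend_to_memory(query_vec, k=32):
    """Retrieve the top-k nearest memory vectors, then take an
    attention-weighted average; the result can be added back into
    a Transformer layer's hidden state."""
    scores = mention_memory @ query_vec
    top = np.argpartition(scores, -k)[-k:]
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()
    return w @ mention_memory[top]

hidden = rng.normal(size=(D,)).astype(np.float32)
knowledge = attend_to_memory(hidden)
print(knowledge.shape)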
Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs such as books or article collections. We propose ReadTwice, a simple and effective technique that combines several …
External link:
http://arxiv.org/abs/2105.04241
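As its name suggests, ReadTwice processes the input in two passes. A common way to realize that, sketched below with a stand-in encoder: the first pass pools each segment into a compact memory vector, and the second pass re-encodes each segment with attention over all segments' memories, so information flows across the whole document. Illustrative only, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(7)
D = 64

def encode(segment, extra=None):
    h = np.tanh(segment)    # stand-in for a Transformer pass
    if extra is not None:
        att = h @ extra.T
        att = np.exp(att - att.max(1, keepdims=True))
        att /= att.sum(1, keepdims=True)
        h = h + att @ extra    # attend to memories of other segments
    return h

# A long document split into segments that each fit in the encoder.
segments = [rng.normal(size=(128, D)) for _ in range(16)]

# First read: encode each segment alone and pool it into one memory vector.
memories = np.stack([encode(s).mean(axis=0) for s in segments])    # [16, D]

# Second read: re-encode each segment with access to all segments' memories.
second = [encode(s, extra=memories) for s in segments]
print(second[0].shape)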