Showing 1 - 10 of 163 for search: '"Rush, Alexander M"'
Author:
Yin, Junjie Oscar, Rush, Alexander M.
Data selection can reduce the amount of training data needed to finetune LLMs; however, the efficacy of data selection scales directly with its compute. Motivated by the practical challenge of compute-constrained finetuning, we consider the setting …
External link:
http://arxiv.org/abs/2410.16208
Author:
Morris, John X., Rush, Alexander M.
Dense document embeddings are central to neural retrieval. The dominant paradigm is to train and construct embeddings by running encoders directly on individual documents. In this work, we argue that these embeddings, while effective, are implicitly …
External link:
http://arxiv.org/abs/2410.02525
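The dominant paradigm the abstract describes can be sketched in a few lines: encode each document independently into a vector, then rank by cosine similarity to a query vector. This is an illustrative minimal sketch of standard dense retrieval, not the paper's method; the embeddings here are plain arrays, whereas a real system would produce them with a trained encoder.

```python
import numpy as np

def retrieve(doc_embeddings, query_embedding, top_k=3):
    """Rank documents by cosine similarity to a query embedding.

    doc_embeddings: (num_docs, dim) array, one row per document.
    query_embedding: (dim,) array for the query.
    Returns indices of the top_k most similar documents.
    """
    # Normalize so the dot product equals cosine similarity.
    docs = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = docs @ q
    # Highest-scoring documents first.
    return np.argsort(scores)[::-1][:top_k]
```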
Author:
Lu, Yi, Yan, Jing Nathan, Yang, Songlin, Chiu, Justin T., Ren, Siyu, Yuan, Fei, Zhao, Wenting, Wu, Zhiyong, Rush, Alexander M.
Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending …
External link:
http://arxiv.org/abs/2409.12181
Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of …
External link:
http://arxiv.org/abs/2408.15237
$K$-nearest neighbor language models ($k$NN-LMs), which integrate retrieval with next-word prediction, have demonstrated strong performance in language modeling as well as downstream NLP benchmarks. These results have led researchers to argue that …
External link:
http://arxiv.org/abs/2408.11815
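The retrieval-plus-prediction integration the abstract refers to is, in the standard $k$NN-LM formulation, an interpolation of the base model's next-token distribution with a distribution built from retrieved neighbors. A minimal sketch under that standard formulation (the datastore, hyperparameters, and function names here are illustrative, not the paper's):

```python
import numpy as np

def knn_lm_next_token(lm_probs, datastore_keys, datastore_next_tokens,
                      query, k=4, temperature=1.0, lam=0.25):
    """Interpolate a base LM's next-token distribution with a kNN
    distribution from a datastore of (context vector, next token) pairs.

    lm_probs: (vocab,) base model distribution over the next token.
    datastore_keys: (N, dim) stored context vectors.
    datastore_next_tokens: (N,) token id observed after each context.
    query: (dim,) context vector for the current position.
    """
    # L2 distance from the query to every stored context vector.
    dists = np.linalg.norm(datastore_keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances of the k retrieved neighbors.
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    # Scatter neighbor weights onto their recorded next tokens.
    knn_probs = np.zeros_like(lm_probs)
    for w, tok in zip(weights, datastore_next_tokens[nearest]):
        knn_probs[tok] += w
    # p = lambda * p_kNN + (1 - lambda) * p_LM
    return lam * knn_probs + (1 - lam) * lm_probs
```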
When seeking information from unfamiliar documents, users frequently pose questions that cannot be answered by the documents. While existing large language models (LLMs) identify these unanswerable questions, they do not assist users in reformulating …
External link:
http://arxiv.org/abs/2407.17469
Author:
Akhauri, Yash, AbouElhamayed, Ahmed F, Dotzel, Jordan, Zhang, Zhiru, Rush, Alexander M, Huda, Safeen, Abdelfattah, Mohamed S
The high power consumption and latency-sensitive deployments of large language models (LLMs) have motivated efficiency techniques like quantization and sparsity. Contextual sparsity, where the sparsity pattern is input-dependent, is crucial in LLMs …
External link:
http://arxiv.org/abs/2406.16635
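"Input-dependent sparsity pattern" can be made concrete with a toy MLP block: for each input, only the hidden units with the largest activations are kept, so which units are skipped changes per token. This is an illustrative top-k sketch of the general idea, not the paper's predictor-based method.

```python
import numpy as np

def contextual_sparse_mlp(x, W1, W2, keep_fraction=0.25):
    """One MLP block with contextual (input-dependent) sparsity.

    x: (d_in,) input vector; W1: (d_in, d_hidden); W2: (d_hidden, d_out).
    Only the top keep_fraction of hidden units, ranked by activation
    for THIS input, contribute to the output.
    """
    h = np.maximum(x @ W1, 0.0)                  # ReLU hidden activations
    k = max(1, int(keep_fraction * h.shape[-1]))
    keep = np.argsort(h)[-k:]                    # indices of top-k units
    mask = np.zeros_like(h)
    mask[keep] = 1.0
    return (h * mask) @ W2                       # sparse hidden -> output
```

With `keep_fraction=1.0` the block reduces to the dense computation; smaller fractions trade accuracy for fewer active hidden units per input.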
Author:
Wang, Junxiong, Mousavi, Ali, Attia, Omar, Pradeep, Ronak, Potdar, Saloni, Rush, Alexander M., Minhas, Umar Farooq, Li, Yunyao
Entity disambiguation (ED), which links the mentions of ambiguous entities to their referent entities in a knowledge base, serves as a core component in entity linking (EL). Existing generative approaches demonstrate improved accuracy compared to …
External link:
http://arxiv.org/abs/2404.01626
Token-free language models learn directly from raw bytes and remove the inductive bias of subword tokenization. Operating on bytes, however, results in significantly longer sequences. In this setting, standard autoregressive Transformers scale poorly …
External link:
http://arxiv.org/abs/2401.13660
Language models produce a distribution over the next token; can we use this information to recover the prompt tokens? We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information …
External link:
http://arxiv.org/abs/2311.13647