Showing 1 - 10 of 11,645 for search: '"Rush, P."'
Author:
Schiff, Yair, Sahoo, Subham Sekhar, Phung, Hao, Wang, Guanghan, Boshar, Sam, Dalla-torre, Hugo, de Almeida, Bernardo P., Rush, Alexander, Pierrot, Thomas, Kuleshov, Volodymyr
Diffusion models for continuous data gained widespread adoption owing to their high quality generation and control mechanisms. However, controllable diffusion on discrete data faces challenges given that continuous guidance methods do not directly ap
External link:
http://arxiv.org/abs/2412.10193
Open community-driven platforms like Chatbot Arena that collect user preference data from site visitors have gained a reputation as one of the most trustworthy publicly available benchmarks for LLM performance. While now standard, it is tricky to imp
External link:
http://arxiv.org/abs/2412.04363
Author:
Zhao, Wenting, Jiang, Nan, Lee, Celine, Chiu, Justin T, Cardie, Claire, Gallé, Matthias, Rush, Alexander M
With the goal of benchmarking generative systems beyond expert software development ability, we introduce Commit0, a benchmark that challenges AI agents to write libraries from scratch. Agents are provided with a specification document outlining the
External link:
http://arxiv.org/abs/2412.01769
Author:
Kaushik, Abhishek, Rush, Kayla
Music is a potent form of expression that can communicate, accentuate or even create the emotions of an individual or a collective. Both historically and in contemporary experiences, musical expression was and is commonly instrumentalized for social,
External link:
http://arxiv.org/abs/2411.06420
Author:
Yin, Junjie Oscar, Rush, Alexander M.
Data selection can reduce the amount of training data needed to finetune LLMs; however, the efficacy of data selection scales directly with its compute. Motivated by the practical challenge of compute-constrained finetuning, we consider the setting i
External link:
http://arxiv.org/abs/2410.16208
Author:
Morris, John X., Rush, Alexander M.
Dense document embeddings are central to neural retrieval. The dominant paradigm is to train and construct embeddings by running encoders directly on individual documents. In this work, we argue that these embeddings, while effective, are implicitly
External link:
http://arxiv.org/abs/2410.02525
Author:
Singh, Vikash, Khanzadeh, Matthew, Davis, Vincent, Rush, Harrison, Rossi, Emanuele, Shrader, Jesse, Lio, Pietro
We present Bayesian Binary Search (BBS), a novel probabilistic variant of the classical binary search/bisection algorithm. BBS leverages machine learning/statistical techniques to estimate the probability density of the search space and modifies the
External link:
http://arxiv.org/abs/2410.01771
Author:
Lu, Yi, Yan, Jing Nathan, Yang, Songlin, Chiu, Justin T., Ren, Siyu, Yuan, Fei, Zhao, Wenting, Wu, Zhiyong, Rush, Alexander M.
Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending
External link:
http://arxiv.org/abs/2409.12181
Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of conve
External link:
http://arxiv.org/abs/2408.15237
$K$-nearest neighbor language models ($k$NN-LMs), which integrate retrieval with next-word prediction, have demonstrated strong performance in language modeling as well as downstream NLP benchmarks. These results have led researchers to argue that mo
External link:
http://arxiv.org/abs/2408.11815