Search results

Report

Simple Guidance Mechanisms for Discrete Diffusion Models

Authors: Schiff, Yair; Sahoo, Subham Sekhar; Phung, Hao; Wang, Guanghan; Boshar, Sam; Dalla-torre, Hugo; de Almeida, Bernardo P.; Rush, Alexander; Pierrot, Thomas; Kuleshov, Volodymyr

Diffusion models for continuous data gained widespread adoption owing to their high quality generation and control mechanisms. However, controllable diffusion on discrete data faces challenges given that continuous guidance methods do not directly ap

External link: http://arxiv.org/abs/2412.10193

Report

Challenges in Trustworthy Human Evaluation of Chatbots

Authors: Zhao, Wenting; Rush, Alexander M.; Goyal, Tanya

Open community-driven platforms like Chatbot Arena that collect user preference data from site visitors have gained a reputation as one of the most trustworthy publicly available benchmarks for LLM performance. While now standard, it is tricky to imp

External link: http://arxiv.org/abs/2412.04363

Report

Commit0: Library Generation from Scratch

Authors: Zhao, Wenting; Jiang, Nan; Lee, Celine; Chiu, Justin T; Cardie, Claire; Gallé, Matthias; Rush, Alexander M

With the goal of benchmarking generative systems beyond expert software development ability, we introduce Commit0, a benchmark that challenges AI agents to write libraries from scratch. Agents are provided with a specification document outlining the

External link: http://arxiv.org/abs/2412.01769

Report

Generating Mixcode Popular Songs with Artificial Intelligence: Concepts, Plans, and Speculations

Authors: Kaushik, Abhishek; Rush, Kayla

Music is a potent form of expression that can communicate, accentuate or even create the emotions of an individual or a collective. Both historically and in contemporary experiences, musical expression was and is commonly instrumentalized for social,

External link: http://arxiv.org/abs/2411.06420

Report

Compute-Constrained Data Selection

Authors: Yin, Junjie Oscar; Rush, Alexander M.

Data selection can reduce the amount of training data needed to finetune LLMs; however, the efficacy of data selection scales directly with its compute. Motivated by the practical challenge of compute-constrained finetuning, we consider the setting i

External link: http://arxiv.org/abs/2410.16208

Report

Contextual Document Embeddings

Authors: Morris, John X.; Rush, Alexander M.

Dense document embeddings are central to neural retrieval. The dominant paradigm is to train and construct embeddings by running encoders directly on individual documents. In this work, we argue that these embeddings, while effective, are implicitly

External link: http://arxiv.org/abs/2410.02525

Report

Bayesian Binary Search

Authors: Singh, Vikash; Khanzadeh, Matthew; Davis, Vincent; Rush, Harrison; Rossi, Emanuele; Shrader, Jesse; Lio, Pietro

We present Bayesian Binary Search (BBS), a novel probabilistic variant of the classical binary search/bisection algorithm. BBS leverages machine learning/statistical techniques to estimate the probability density of the search space and modifies the

External link: http://arxiv.org/abs/2410.01771

Report

A Controlled Study on Long Context Extension and Generalization in LLMs

Authors: Lu, Yi; Yan, Jing Nathan; Yang, Songlin; Chiu, Justin T.; Ren, Siyu; Yuan, Fei; Zhao, Wenting; Wu, Zhiyong; Rush, Alexander M.

Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending

External link: http://arxiv.org/abs/2409.12181

Report

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Authors: Wang, Junxiong; Paliotta, Daniele; May, Avner; Rush, Alexander M.; Dao, Tri

Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of conve

External link: http://arxiv.org/abs/2408.15237

Report

Great Memory, Shallow Reasoning: Limits of $k$NN-LMs

Authors: Geng, Shangyi; Zhao, Wenting; Rush, Alexander M

$K$-nearest neighbor language models ($k$NN-LMs), which integrate retrieval with next-word prediction, have demonstrated strong performance in language modeling as well as downstream NLP benchmarks. These results have led researchers to argue that mo

External link: http://arxiv.org/abs/2408.11815
