Showing 1 - 10 of 312 for search: '"Roux, Nicolás"'
Humans have the ability to learn new tasks by inferring high-level concepts from existing solutions, then manipulating these concepts in lieu of the raw data. Can we automate this process by deriving latent semantic structures in a document collection…
External link:
http://arxiv.org/abs/2410.05481
Author:
Kazemnejad, Amirhossein, Aghajohari, Milad, Portelance, Eva, Sordoni, Alessandro, Reddy, Siva, Courville, Aaron, Roux, Nicolas Le
Large language models (LLMs) are increasingly applied to complex reasoning tasks that require executing several complex steps before receiving any reward. Properly assigning credit to these steps is essential for enhancing model performance. Proximal…
External link:
http://arxiv.org/abs/2410.01679
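The credit-assignment abstract above describes multi-step reasoning where reward only arrives at the end. As a loosely related illustration rather than the paper's actual algorithm, the sketch below credits each intermediate step with the change in a Monte Carlo value estimate of the partial solution; `rollout`, `mc_value`, and the toy data are all hypothetical stand-ins.

```python
import random
from typing import Callable, List

def mc_value(state: str, rollout: Callable[[str], float], n: int = 8) -> float:
    """Monte Carlo estimate of the expected final reward from a partial solution."""
    return sum(rollout(state) for _ in range(n)) / n

def stepwise_advantages(states: List[str], rollout: Callable[[str], float]) -> List[float]:
    """Credit each reasoning step with the change in estimated value it produces.

    `states` holds the partial solution after each step; `rollout` (hypothetical)
    samples a completion from a partial solution and returns its final reward.
    """
    values = [mc_value(s, rollout) for s in states]
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without a model: longer partial solutions
    # succeed more often under the toy rollout.
    partials = ["step1", "step1 step2", "step1 step2 step3"]
    toy_rollout = lambda s: float(random.random() < 0.25 * len(s.split()))
    print(stepwise_advantages(partials, toy_rollout))
```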
While finetuning language models from pairwise preferences has proven remarkably effective, the underspecified nature of natural language presents critical challenges. Direct preference feedback is uninterpretable, difficult to provide where multidimensional…
External link:
http://arxiv.org/abs/2407.14916
Author:
Ostapenko, Oleksiy, Su, Zhan, Ponti, Edoardo Maria, Charlin, Laurent, Roux, Nicolas Le, Pereira, Matheus, Caccia, Lucas, Sordoni, Alessandro
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given mult…
External link:
http://arxiv.org/abs/2405.11157
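The abstract above asks how to reuse a library of trained adapters for new tasks. The sketch below is not the routing studied in the paper; it only illustrates the general idea of scoring library adapters against a new task and mixing the best matches, with the prototype vectors, cosine similarity, and softmax mixing all being illustrative assumptions.

```python
import numpy as np

def route_to_adapter_library(task_embedding: np.ndarray,
                             adapter_prototypes: dict,
                             top_k: int = 2) -> dict:
    """Score library adapters against a new task and return mixing weights for the best matches.

    `adapter_prototypes` maps an adapter name to a vector summarizing the task it was
    trained on; the names, prototypes, and similarity rule are illustrative placeholders.
    """
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    sims = {name: cosine(task_embedding, proto) for name, proto in adapter_prototypes.items()}
    chosen = sorted(sims, key=sims.get, reverse=True)[:top_k]
    scores = np.array([sims[name] for name in chosen])
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax over the selected adapters
    return dict(zip(chosen, weights.tolist()))

if __name__ == "__main__":
    library = {
        "qa": np.array([1.0, 0.1, 0.0]),
        "summarization": np.array([0.1, 1.0, 0.0]),
        "code": np.array([0.6, 0.6, 0.5]),
    }
    print(route_to_adapter_library(np.array([0.9, 0.3, 0.1]), library))
```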
Author:
Fu, Haotian, Sharma, Pratyusha, Stengel-Eskin, Elias, Konidaris, George, Roux, Nicolas Le, Côté, Marc-Alexandre, Yuan, Xingdi
We present an algorithm for skill discovery from expert demonstrations. The algorithm first utilizes Large Language Models (LLMs) to propose an initial segmentation of the trajectories. Following that, a hierarchical variational inference framework is…
External link:
http://arxiv.org/abs/2402.16354
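The skill-discovery abstract describes a two-stage pipeline: an LLM proposes a segmentation of the demonstrations, which a second stage then refines. The skeleton below only mirrors that control flow; `llm_segment` and `refine` are hypothetical callables, not the paper's prompt or its variational-inference procedure.

```python
from typing import Callable, List, Sequence, Tuple

Trajectory = List[str]                 # a demonstration as a list of low-level actions
Segmentation = List[Tuple[int, int]]   # (start, end) indices of proposed skill segments

def discover_skills(trajectories: Sequence[Trajectory],
                    llm_segment: Callable[[Trajectory], Segmentation],
                    refine: Callable[[Sequence[Trajectory], List[Segmentation]], List[Segmentation]],
                    ) -> List[Segmentation]:
    """Two-stage skeleton: an LLM proposes segment boundaries, a second stage refines them."""
    initial = [llm_segment(t) for t in trajectories]   # stage 1: per-trajectory proposals
    return refine(trajectories, initial)               # stage 2: joint refinement

if __name__ == "__main__":
    # Toy stand-ins: split each demonstration in half, then keep the proposals unchanged.
    halve = lambda t: [(0, len(t) // 2), (len(t) // 2, len(t))]
    keep = lambda trajs, segs: segs
    print(discover_skills([["open", "walk", "grasp", "lift"]], halve, keep))
```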
Author:
Sordoni, Alessandro, Yuan, Xingdi, Côté, Marc-Alexandre, Pereira, Matheus, Trischler, Adam, Xiao, Ziang, Hosseini, Arian, Niedtner, Friederike, Roux, Nicolas Le
Large language models (LLMs) can be seen as atomic units of computation mapping sequences to a distribution over sequences. Thus, they can be seen as stochastic language layers in a language network, where the learnable parameters are the natural language…
External link:
http://arxiv.org/abs/2306.12509
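The abstract above views an LLM call as a stochastic layer whose learnable parameters are natural-language prompts. Below is a minimal sketch of that framing, assuming a hypothetical `LLM` callable; it shows two prompt-parameterized layers composed like a tiny network, not the paper's actual implementation.

```python
from typing import Callable

LLM = Callable[[str], str]   # hypothetical stand-in for a (stochastic) call to a language model

class LanguageLayer:
    """A 'language layer': its only learnable parameter is a natural-language prompt."""

    def __init__(self, llm: LLM, prompt: str):
        self.llm = llm
        self.prompt = prompt   # the natural-language "weights" of this layer

    def __call__(self, x: str) -> str:
        # One forward pass: condition the model on the layer's prompt plus the input.
        return self.llm(f"{self.prompt}\n\nInput: {x}\nOutput:")

if __name__ == "__main__":
    # Toy model so the example runs offline: it just echoes the input field.
    toy_llm: LLM = lambda text: text.split("Input: ")[-1].split("\n")[0]
    layer1 = LanguageLayer(toy_llm, "Rewrite the input as a question.")
    layer2 = LanguageLayer(toy_llm, "Answer the question concisely.")
    print(layer2(layer1("The sky is blue.")))
```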
The growing utilization of machine learning (ML) in decision-making processes raises questions about its benefits to society. In this study, we identify and analyze three axes of heterogeneity that significantly influence the trajectory of ML product…
External link:
http://arxiv.org/abs/2306.10043
Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the TD error…
External link:
http://arxiv.org/abs/2305.15249
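The actor-critic abstract notes that the critic is usually trained by minimizing the TD error. The snippet below spells out the standard one-step version, delta = r + gamma * V(s') - V(s), with a mean-squared-error critic objective; the numbers in the example are arbitrary.

```python
import numpy as np

def td_error(v_s: float, v_next: float, reward: float,
             gamma: float = 0.99, done: bool = False) -> float:
    """One-step temporal-difference error: delta = r + gamma * V(s') - V(s)."""
    target = reward + (0.0 if done else gamma * v_next)
    return target - v_s

def critic_loss(deltas: np.ndarray) -> float:
    """The usual critic objective: mean squared TD error over a batch of transitions."""
    return float(np.mean(np.square(deltas)))

if __name__ == "__main__":
    batch = np.array([
        td_error(v_s=0.5, v_next=0.6, reward=1.0),
        td_error(v_s=0.2, v_next=0.0, reward=0.0, done=True),
    ])
    print(critic_loss(batch))
```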
Author:
Lavington, Jonathan Wilder, Vaswani, Sharan, Babanezhad, Reza, Schmidt, Mark, Roux, Nicolas Le
We consider minimizing functions for which it is expensive to compute the (possibly stochastic) gradient. Such functions are prevalent in reinforcement learning, imitation learning and adversarial training. Our target optimization framework uses the…
External link:
http://arxiv.org/abs/2302.02607
Author:
Caccia, Lucas, Ponti, Edoardo, Su, Zhan, Pereira, Matheus, Roux, Nicolas Le, Sordoni, Alessandro
Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] (Poly) jointly learns an inventory…
External link:
http://arxiv.org/abs/2211.03831
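The Polytropon abstract describes jointly learning an inventory of adapters together with a routing function that combines them per task. The sketch below only illustrates the combination step under simplifying assumptions (flattened adapters, softmax routing); it is not the paper's implementation.

```python
import numpy as np

def combine_adapters(modules: np.ndarray, routing_logits: np.ndarray) -> np.ndarray:
    """Blend a shared inventory of adapter modules into one task-specific adapter.

    modules:        (n_modules, d) array, one flattened adapter per row
    routing_logits: (n_modules,) per-task scores, learned jointly with the modules
    """
    weights = np.exp(routing_logits - routing_logits.max())
    weights /= weights.sum()            # soft routing over the inventory
    return weights @ modules            # (d,) combined adapter parameters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    inventory = rng.normal(size=(4, 8))             # 4 modules, 8 parameters each
    task_logits = np.array([2.0, 0.1, -1.0, 0.5])   # one task's routing scores
    print(combine_adapters(inventory, task_logits).shape)
```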