Showing 1 - 10 of 201 for search: '"Poupart, Pascal"'
Author:
Liu, Guiliang, Xu, Sheng, Liu, Shicheng, Gaurav, Ashish, Subramanian, Sriram Ganapathi, Poupart, Pascal
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints followed by expert agents from their demonstration data. As an emerging research topic, ICRL has received considerable attention in recent years. This …
External link:
http://arxiv.org/abs/2409.07569
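The record above only states the ICRL problem. As a rough illustration of the underlying idea (a toy sketch, not any of the surveyed methods): state-action pairs that an unconstrained, reward-greedy policy would use but that expert demonstrations consistently avoid are natural candidates for implicit constraints. The visit sets below are hypothetical.

```python
# Toy illustration of the ICRL idea: experts avoid certain state-action
# pairs even though they look rewarding, so those pairs are candidate
# constraints. All data below is synthetic.

# State-action pairs visited in expert demonstrations (assumed given).
expert_visits = {(0, 1), (1, 1), (2, 1), (3, 1), (4, 0)}

# State-action pairs a purely reward-greedy policy would use (assumed given,
# e.g. obtained by planning in the environment without any constraints).
greedy_visits = {(0, 1), (1, 0), (2, 1), (3, 0), (4, 0)}

# Naive constraint inference: pairs the unconstrained policy relies on
# but experts never take are flagged as likely constrained.
candidate_constraints = sorted(greedy_visits - expert_visits)
print("candidate constrained (state, action) pairs:", candidate_constraints)
```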
Author:
Miao, Yanting, Loh, William, Kothawade, Suraj, Poupart, Pascal, Rashwan, Abdullah, Li, Yeqing
Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given reference images …
External link:
http://arxiv.org/abs/2407.12164
Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge communication …
External link:
http://arxiv.org/abs/2407.08337
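For context on the communication cost mentioned in the snippet: a common FRL pattern (FedRep-style splits, not necessarily the method in the linked paper) communicates only a shared feature extractor and keeps a personalized head on every client. The sketch below, with hypothetical parameter shapes, shows that round structure.

```python
import numpy as np

# Sketch of the communication pattern used by representation-sharing FRL
# methods: clients send only the feature extractor ("body") to the server
# and keep a personalized "head" local, so far fewer parameters go over
# the wire than with full-model averaging. Shapes and data are toys.

rng = np.random.default_rng(0)
n_clients, d_feat, d_out = 3, 8, 2

clients = [
    {"body": rng.normal(size=(d_feat, d_feat)),   # shared representation
     "head": rng.normal(size=(d_feat, d_out))}    # personalized, never sent
    for _ in range(n_clients)
]

def server_round(clients):
    # Average only the bodies and broadcast the result back.
    avg_body = np.mean([c["body"] for c in clients], axis=0)
    for c in clients:
        c["body"] = avg_body.copy()
    return clients

clients = server_round(clients)
print("bodies identical after round:",
      all(np.allclose(clients[0]["body"], c["body"]) for c in clients))
```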
Author:
Grosse, Julia, Wu, Ruotian, Rashid, Ahmad, Hennig, Philipp, Poupart, Pascal, Kristiadi, Agustinus
Tree search algorithms such as greedy and beam search are the standard when it comes to finding sequences of maximum likelihood in the decoding processes of large language models (LLMs). However, they are myopic since they do not take the complete ro…
External link:
http://arxiv.org/abs/2407.03951
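As background for this snippet, the sketch below implements plain beam search over a hypothetical next-token log-probability function standing in for an LLM. It shows the standard maximum-likelihood decoding the abstract calls myopic, not the paper's proposed alternative.

```python
import math

def next_token_logprobs(prefix):
    # Hypothetical stand-in for an LLM: slightly prefers "a", forces the
    # end-of-sequence token once the prefix reaches four tokens.
    if len(prefix) >= 4:
        return {"</s>": 0.0, "a": -10.0, "b": -10.0}
    return {"a": math.log(0.5), "b": math.log(0.3), "</s>": math.log(0.2)}

def beam_search(beam_width=2, max_len=6):
    beams = [([], 0.0)]                      # (token sequence, total log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "</s>":    # finished sequences carry over
                candidates.append((seq, score))
                continue
            for tok, lp in next_token_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # Keep only the top-k partial sequences by cumulative log-probability;
        # this greedy pruning is what makes the search myopic.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq and seq[-1] == "</s>" for seq, _ in beams):
            break
    return beams

for seq, score in beam_search():
    print(" ".join(seq), f"(log-prob {score:.2f})")
```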
Author:
Subramanian, Sriram Ganapathi, Liu, Guiliang, Elmahgiubi, Mohammed, Rezaee, Kasra, Poupart, Pascal
In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal …
External link:
http://arxiv.org/abs/2406.16782
Large language models (LLMs) can be significantly improved by aligning them to human preferences -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. Due to their ability …
External link:
http://arxiv.org/abs/2406.07780
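For background on the RLHF pipeline this snippet refers to: the reward-modelling step is commonly trained with a Bradley-Terry pairwise preference loss, -log sigmoid(r(chosen) - r(rejected)). The sketch below uses a hypothetical linear reward model on synthetic response embeddings; it is background only, not the linked paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
w = rng.normal(size=d) * 0.01            # toy reward-model parameters

def reward(x, w):
    return x @ w                          # scalar score for a response embedding

def bt_loss_and_grad(x_chosen, x_rejected, w):
    # Bradley-Terry pairwise loss: push the preferred response's score
    # above the rejected one's.
    margin = reward(x_chosen, w) - reward(x_rejected, w)
    p = 1.0 / (1.0 + np.exp(-margin))     # P(chosen preferred over rejected)
    loss = -np.log(p)
    grad = -(1.0 - p) * (x_chosen - x_rejected)
    return loss, grad

# One gradient step on a synthetic preference pair.
x_chosen, x_rejected = rng.normal(size=d), rng.normal(size=d)
loss, grad = bt_loss_and_grad(x_chosen, x_rejected, w)
w -= 0.1 * grad
print(f"pairwise preference loss: {loss:.3f}")
```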
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Subramanian, Sriram Ganapathi, Fortuin, Vincent, Poupart, Pascal, Pleiss, Geoff
Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human feedback …
External link:
http://arxiv.org/abs/2406.06459
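To make the vanilla Bayesian-optimization loop behind this snippet concrete, here is a minimal Gaussian-process surrogate with an expected-improvement acquisition on a 1-D toy objective. The objective, kernel length-scale, and candidate grid are all hypothetical choices, and how the paper actually incorporates human feedback is not shown.

```python
import numpy as np
from scipy.stats import norm

def objective(x):                         # hypothetical black-box function
    return -np.sin(3 * x) - x ** 2 + 0.7 * x

def rbf(a, b, ls=0.3):                    # squared-exponential kernel
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # Standard GP regression equations for posterior mean and std dev.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xq), rbf(Xq, Xq)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.clip(np.diag(Kss - Ks.T @ Kinv @ Ks), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 2, size=3)            # initial design
y = objective(X)
Xq = np.linspace(-1, 2, 200)              # candidate grid

for _ in range(10):                       # BO iterations
    mu, sigma = gp_posterior(X, y, Xq)
    x_next = Xq[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

print(f"best x: {X[np.argmax(y)]:.3f}, best value: {y.max():.3f}")
```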
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two main facts: …
External link:
http://arxiv.org/abs/2403.11062
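For background on the quantity this snippet optimizes: CVaR at level alpha can be estimated from sampled returns as the mean of the worst alpha-fraction, and the classic REINFORCE-style CVaR gradient weights only those tail trajectories, which is one source of the sample inefficiency mentioned. The one-parameter Gaussian policy and synthetic returns below are toys, not the linked paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1                                   # risk level: worst 10% of returns
theta = 0.0                                   # policy parameter (mean action)

def rollout(theta, n):
    # Toy one-step "trajectories": a unit-variance Gaussian policy over
    # actions, with a noisy quadratic return peaked at action = 1.
    actions = rng.normal(loc=theta, scale=1.0, size=n)
    returns = -(actions - 1.0) ** 2 + rng.normal(scale=0.1, size=n)
    score = actions - theta                   # d/dtheta log N(a; theta, 1)
    return returns, score

returns, score = rollout(theta, n=5000)
var_alpha = np.quantile(returns, alpha)       # Value-at-Risk threshold
tail = returns <= var_alpha                   # worst alpha-fraction of samples

cvar = returns[tail].mean()                   # sample estimate of CVaR_alpha
# REINFORCE-style CVaR gradient: only tail samples contribute, so most of
# the 5000 rollouts carry zero weight in this update.
grad = ((returns[tail] - var_alpha) * score[tail]).sum() / (alpha * len(returns))

print(f"CVaR_{alpha:.2f} estimate: {cvar:.3f}, gradient estimate: {grad:.3f}")
```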
Author:
Schulte, Oliver, Poupart, Pascal
Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select interventions that maximize …
External link:
http://arxiv.org/abs/2403.04221
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Skreta, Marta, Poupart, Pascal, Aspuru-Guzik, Alán, Pleiss, Geoff
Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge into efficient exploration of a large molecular space.
External link:
http://arxiv.org/abs/2402.05015