Showing 1 - 10 of 722 for search: '"P. Poupart"'
Author:
Mobarakeh, Niloufar Saeidi, Khamidehi, Behzad, Li, Chunlin, Mirkhani, Hamidreza, Arasteh, Fazel, Elmahgiubi, Mohammed, Zhang, Weize, Rezaee, Kasra, Poupart, Pascal
The primary goal of motion planning is to generate safe and efficient trajectories for vehicles. Traditionally, motion planning models are trained using imitation learning to mimic the behavior of human experts. However, these models often lack inter…
External link:
http://arxiv.org/abs/2412.05717
Author:
Liu, Guiliang, Xu, Sheng, Liu, Shicheng, Gaurav, Ashish, Subramanian, Sriram Ganapathi, Poupart, Pascal
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints followed by expert agents from their demonstration data. As an emerging research topic, ICRL has received considerable attention in recent years. This… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2409.07569
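The snippet above only states the ICRL task. Purely as illustration, and not the method from the linked paper, one simple heuristic infers candidate constraints by comparing which state-action pairs a reward-maximizing but unconstrained policy visits while the expert demonstrations never do; the function and argument names below are hypothetical.

    from collections import Counter

    def infer_candidate_constraints(expert_trajs, unconstrained_trajs, min_count=5):
        # Flag (state, action) pairs as candidate constraints when an unconstrained,
        # reward-maximizing policy uses them but the expert never does.
        expert_visits = Counter(sa for traj in expert_trajs for sa in traj)
        agent_visits = Counter(sa for traj in unconstrained_trajs for sa in traj)
        return {sa for sa, n in agent_visits.items()
                if n >= min_count and expert_visits[sa] == 0}

Trajectories are assumed to be lists of hashable (state, action) tuples; real ICRL methods typically learn a differentiable constraint function rather than this counting heuristic.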
Author:
Miao, Yanting, Loh, William, Kothawade, Suraj, Poupart, Pascal, Rashwan, Abdullah, Li, Yeqing
Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given reference imag…
External link:
http://arxiv.org/abs/2407.12164
Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge communication… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2407.08337
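To make the parameter-sharing trade-off above concrete, here is a generic personalization pattern, assuming PyTorch and written as a sketch of mine rather than taken from the linked paper: only a shared representation body is communicated and averaged across clients, while each client keeps a small personalized head locally, so communication scales with the shared part only.

    import copy
    import torch
    import torch.nn as nn

    class ClientModel(nn.Module):
        # Shared feature extractor (communicated) plus a personalized head (kept local).
        def __init__(self, in_dim=32, hidden=64, n_classes=10):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # shared part
            self.head = nn.Linear(hidden, n_classes)                         # stays on the client
        def forward(self, x):
            return self.head(self.body(x))

    def aggregate_shared(models):
        # FedAvg over the shared body only; the heads never leave the clients.
        avg = copy.deepcopy(models[0].body.state_dict())
        for key in avg:
            avg[key] = torch.stack([m.body.state_dict()[key].float() for m in models]).mean(dim=0)
        for m in models:
            m.body.load_state_dict(avg)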
Author:
Grosse, Julia, Wu, Ruotian, Rashid, Ahmad, Hennig, Philipp, Poupart, Pascal, Kristiadi, Agustinus
Tree search algorithms such as greedy and beam search are the standard when it comes to finding sequences of maximum likelihood in the decoding processes of large language models (LLMs). However, they are myopic since they do not take the complete ro… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2407.03951
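For context on the baseline the snippet refers to: beam search keeps only the locally highest-scoring partial sequences at each step, which is what makes it myopic. Below is a generic sketch of that baseline, not the method proposed in the linked paper; next_logprobs is a hypothetical scoring function standing in for one LLM decoding step.

    import heapq

    def beam_search(next_logprobs, start_token, eos_token, beam_width=4, max_len=20):
        # next_logprobs(seq) -> dict mapping each candidate next token to its log-probability.
        beams = [(0.0, [start_token])]                       # (cumulative log-prob, tokens)
        for _ in range(max_len):
            candidates = []
            for score, seq in beams:
                if seq[-1] == eos_token:
                    candidates.append((score, seq))          # finished sequences carry over
                    continue
                for token, logprob in next_logprobs(seq).items():
                    candidates.append((score + logprob, seq + [token]))
            beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
            if all(seq[-1] == eos_token for _, seq in beams):
                break
        return max(beams, key=lambda c: c[0])

Greedy decoding is the special case beam_width=1.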
Author:
Subramanian, Sriram Ganapathi, Liu, Guiliang, Elmahgiubi, Mohammed, Rezaee, Kasra, Poupart, Pascal
In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal…
External link:
http://arxiv.org/abs/2406.16782
Large language models (LLMs) can be significantly improved by aligning them to human preferences -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. Due to their abilit…
External link:
http://arxiv.org/abs/2406.07780
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Subramanian, Sriram Ganapathi, Fortuin, Vincent, Poupart, Pascal, Pleiss, Geoff
Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human fee… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2406.06459
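For readers unfamiliar with the BO loop mentioned above, here is a generic surrogate-plus-acquisition iteration; it is a minimal sketch assuming NumPy, SciPy, and scikit-learn's GaussianProcessRegressor, and it does not show the human-feedback mechanism the linked paper studies.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def expected_improvement(gp, X_cand, y_best):
        # Expected improvement for minimization under the GP posterior.
        mu, sigma = gp.predict(X_cand, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        z = (y_best - mu) / sigma
        return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    def bayes_opt(objective, low, high, n_init=5, n_iter=20, seed=0):
        rng = np.random.default_rng(seed)
        X = rng.uniform(low, high, size=(n_init, 1))
        y = np.array([objective(x) for x in X])
        gp = GaussianProcessRegressor(normalize_y=True)
        for _ in range(n_iter):
            gp.fit(X, y)
            X_cand = rng.uniform(low, high, size=(256, 1))   # random candidate pool
            x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.min()))]
            X = np.vstack([X, x_next])
            y = np.append(y, objective(x_next))
        return X[np.argmin(y)], y.min()

    # Example: minimize a 1-D quadratic on [0, 1].
    x_best, f_best = bayes_opt(lambda x: float((x[0] - 0.3) ** 2), 0.0, 1.0)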
Author:
Poupart, Yoann
AI has led chess systems to a superhuman level, yet these systems heavily rely on black-box algorithms. This is unsustainable in ensuring transparency to the end-user, particularly when these systems are responsible for sensitive decision-making. Recent…
External link:
http://arxiv.org/abs/2406.04028
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two main facts:… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2403.11062
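To make the CVaR-PG objective in the last snippet concrete, here is the standard tail-averaging REINFORCE-style estimator as a sketch of my own, not the sample-efficiency fix the linked paper proposes: only the worst alpha-fraction of sampled trajectories contributes to the gradient, which is exactly where the inefficiency comes from.

    import numpy as np

    def cvar_policy_gradient(returns, logprob_grads, alpha=0.1):
        # returns:       shape (N,), one Monte Carlo return per sampled trajectory
        # logprob_grads: shape (N, d), gradient of the summed log pi(a|s) per trajectory
        returns = np.asarray(returns, dtype=float)
        var_alpha = np.quantile(returns, alpha)          # value-at-risk threshold
        tail = returns <= var_alpha                      # worst alpha-fraction of trajectories
        weights = np.where(tail, returns - var_alpha, 0.0) / alpha
        return (weights[:, None] * np.asarray(logprob_grads)).mean(axis=0)

With N sampled trajectories, roughly alpha * N of them receive nonzero weight, so most samples are effectively discarded.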