Showing 1 - 10 of 722 for search: '"P. Poupart"'
Author:
Mobarakeh, Niloufar Saeidi, Khamidehi, Behzad, Li, Chunlin, Mirkhani, Hamidreza, Arasteh, Fazel, Elmahgiubi, Mohammed, Zhang, Weize, Rezaee, Kasra, Poupart, Pascal
The primary goal of motion planning is to generate safe and efficient trajectories for vehicles. Traditionally, motion planning models are trained using imitation learning to mimic the behavior of human experts. However, these models often lack inter…
External link:
http://arxiv.org/abs/2412.05717
Author:
Liu, Guiliang, Xu, Sheng, Liu, Shicheng, Gaurav, Ashish, Subramanian, Sriram Ganapathi, Poupart, Pascal
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints followed by expert agents from their demonstration data. As an emerging research topic, ICRL has received considerable attention in recent years. This… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2409.07569
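The snippet above only states the ICRL task. Purely as illustration, and not the method from the linked paper, one simple heuristic infers candidate constraints by comparing which state-action pairs a reward-maximizing but unconstrained policy visits while the expert demonstrations never do; the function and argument names below are hypothetical.

    from collections import Counter

    def infer_candidate_constraints(expert_trajs, unconstrained_trajs, min_count=5):
        # Flag (state, action) pairs as candidate constraints when an unconstrained,
        # reward-maximizing policy uses them but the expert never does.
        expert_visits = Counter(sa for traj in expert_trajs for sa in traj)
        agent_visits = Counter(sa for traj in unconstrained_trajs for sa in traj)
        return {sa for sa, n in agent_visits.items()
                if n >= min_count and expert_visits[sa] == 0}

Trajectories are assumed to be lists of hashable (state, action) tuples; real ICRL methods typically learn a differentiable constraint function rather than this counting heuristic.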
Author:
Miao, Yanting, Loh, William, Kothawade, Suraj, Poupart, Pascal, Rashwan, Abdullah, Li, Yeqing
Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given reference imag…
External link:
http://arxiv.org/abs/2407.12164
Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge communication… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2407.08337
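To make the parameter-sharing trade-off above concrete, here is a generic personalization pattern, assuming PyTorch and written as a sketch of mine rather than taken from the linked paper: only a shared representation body is communicated and averaged across clients, while each client keeps a small personalized head locally, so communication scales with the shared part only.

    import copy
    import torch
    import torch.nn as nn

    class ClientModel(nn.Module):
        # Shared feature extractor (communicated) plus a personalized head (kept local).
        def __init__(self, in_dim=32, hidden=64, n_classes=10):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # shared part
            self.head = nn.Linear(hidden, n_classes)                         # stays on the client
        def forward(self, x):
            return self.head(self.body(x))

    def aggregate_shared(models):
        # FedAvg over the shared body only; the heads never leave the clients.
        avg = copy.deepcopy(models[0].body.state_dict())
        for key in avg:
            avg[key] = torch.stack([m.body.state_dict()[key].float() for m in models]).mean(dim=0)
        for m in models:
            m.body.load_state_dict(avg)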
Author:
Grosse, Julia, Wu, Ruotian, Rashid, Ahmad, Hennig, Philipp, Poupart, Pascal, Kristiadi, Agustinus
Tree search algorithms such as greedy and beam search are the standard when it comes to finding sequences of maximum likelihood in the decoding processes of large language models (LLMs). However, they are myopic since they do not take the complete ro… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2407.03951
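For context on the baseline the snippet refers to: beam search keeps only the locally highest-scoring partial sequences at each step, which is what makes it myopic. Below is a generic sketch of that baseline, not the method proposed in the linked paper; next_logprobs is a hypothetical scoring function standing in for one LLM decoding step.

    import heapq

    def beam_search(next_logprobs, start_token, eos_token, beam_width=4, max_len=20):
        # next_logprobs(seq) -> dict mapping each candidate next token to its log-probability.
        beams = [(0.0, [start_token])]                       # (cumulative log-prob, tokens)
        for _ in range(max_len):
            candidates = []
            for score, seq in beams:
                if seq[-1] == eos_token:
                    candidates.append((score, seq))          # finished sequences carry over
                    continue
                for token, logprob in next_logprobs(seq).items():
                    candidates.append((score + logprob, seq + [token]))
            beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
            if all(seq[-1] == eos_token for _, seq in beams):
                break
        return max(beams, key=lambda c: c[0])

Greedy decoding is the special case beam_width=1.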
Author:
Subramanian, Sriram Ganapathi, Liu, Guiliang, Elmahgiubi, Mohammed, Rezaee, Kasra, Poupart, Pascal
In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal…
External link:
http://arxiv.org/abs/2406.16782
Large language models (LLMs) can be significantly improved by aligning them to human preferences -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. Due to their abilit…
External link:
http://arxiv.org/abs/2406.07780
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Subramanian, Sriram Ganapathi, Fortuin, Vincent, Poupart, Pascal, Pleiss, Geoff
Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human fee… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2406.06459
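For readers unfamiliar with the BO loop mentioned above, here is a generic surrogate-plus-acquisition iteration; it is a minimal sketch assuming NumPy, SciPy, and scikit-learn's GaussianProcessRegressor, and it does not show the human-feedback mechanism the linked paper studies.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def expected_improvement(gp, X_cand, y_best):
        # Expected improvement for minimization under the GP posterior.
        mu, sigma = gp.predict(X_cand, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        z = (y_best - mu) / sigma
        return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    def bayes_opt(objective, low, high, n_init=5, n_iter=20, seed=0):
        rng = np.random.default_rng(seed)
        X = rng.uniform(low, high, size=(n_init, 1))
        y = np.array([objective(x) for x in X])
        gp = GaussianProcessRegressor(normalize_y=True)
        for _ in range(n_iter):
            gp.fit(X, y)
            X_cand = rng.uniform(low, high, size=(256, 1))   # random candidate pool
            x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.min()))]
            X = np.vstack([X, x_next])
            y = np.append(y, objective(x_next))
        return X[np.argmin(y)], y.min()

    # Example: minimize a 1-D quadratic on [0, 1].
    x_best, f_best = bayes_opt(lambda x: float((x[0] - 0.3) ** 2), 0.0, 1.0)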
Author:
Poupart, Yoann
AI has led chess systems to a superhuman level, yet these systems heavily rely on black-box algorithms. This is unsustainable in ensuring transparency to the end-user, particularly when these systems are responsible for sensitive decision-making. Recent…
External link:
http://arxiv.org/abs/2406.04028
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two main facts:… (A rough illustrative sketch follows this entry.)
External link:
http://arxiv.org/abs/2403.11062
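To make the CVaR-PG objective in the last snippet concrete, here is the standard tail-averaging REINFORCE-style estimator as a sketch of my own, not the sample-efficiency fix the linked paper proposes: only the worst alpha-fraction of sampled trajectories contributes to the gradient, which is exactly where the inefficiency comes from.

    import numpy as np

    def cvar_policy_gradient(returns, logprob_grads, alpha=0.1):
        # returns:       shape (N,), one Monte Carlo return per sampled trajectory
        # logprob_grads: shape (N, d), gradient of the summed log pi(a|s) per trajectory
        returns = np.asarray(returns, dtype=float)
        var_alpha = np.quantile(returns, alpha)          # value-at-risk threshold
        tail = returns <= var_alpha                      # worst alpha-fraction of trajectories
        weights = np.where(tail, returns - var_alpha, 0.0) / alpha
        return (weights[:, None] * np.asarray(logprob_grads)).mean(axis=0)

With N sampled trajectories, roughly alpha * N of them receive nonzero weight, so most samples are effectively discarded.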