Showing 1 - 10 of 201 for search: '"Poupart, Pascal"'
Author:
Liu, Guiliang, Xu, Sheng, Liu, Shicheng, Gaurav, Ashish, Subramanian, Sriram Ganapathi, Poupart, Pascal
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints followed by expert agents from their demonstration data. As an emerging research topic, ICRL has received considerable attention in recent years. This …
External link:
http://arxiv.org/abs/2409.07569
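The record above only states the ICRL problem. As a rough illustration of the underlying idea (a toy sketch, not any of the surveyed methods): state-action pairs that an unconstrained, reward-greedy policy would use but that expert demonstrations consistently avoid are natural candidates for implicit constraints. The visit sets below are hypothetical.

```python
# Toy illustration of the ICRL idea: experts avoid certain state-action
# pairs even though they look rewarding, so those pairs are candidate
# constraints. All data below is synthetic.

# State-action pairs visited in expert demonstrations (assumed given).
expert_visits = {(0, 1), (1, 1), (2, 1), (3, 1), (4, 0)}

# State-action pairs a purely reward-greedy policy would use (assumed given,
# e.g. obtained by planning in the environment without any constraints).
greedy_visits = {(0, 1), (1, 0), (2, 1), (3, 0), (4, 0)}

# Naive constraint inference: pairs the unconstrained policy relies on
# but experts never take are flagged as likely constrained.
candidate_constraints = sorted(greedy_visits - expert_visits)
print("candidate constrained (state, action) pairs:", candidate_constraints)
```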
Author:
Miao, Yanting, Loh, William, Kothawade, Suraj, Poupart, Pascal, Rashwan, Abdullah, Li, Yeqing
Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given reference images …
External link:
http://arxiv.org/abs/2407.12164
Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge communication …
External link:
http://arxiv.org/abs/2407.08337
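For context on the communication cost mentioned in the snippet: a common FRL pattern (FedRep-style splits, not necessarily the method in the linked paper) communicates only a shared feature extractor and keeps a personalized head on every client. The sketch below, with hypothetical parameter shapes, shows that round structure.

```python
import numpy as np

# Sketch of the communication pattern used by representation-sharing FRL
# methods: clients send only the feature extractor ("body") to the server
# and keep a personalized "head" local, so far fewer parameters go over
# the wire than with full-model averaging. Shapes and data are toys.

rng = np.random.default_rng(0)
n_clients, d_feat, d_out = 3, 8, 2

clients = [
    {"body": rng.normal(size=(d_feat, d_feat)),   # shared representation
     "head": rng.normal(size=(d_feat, d_out))}    # personalized, never sent
    for _ in range(n_clients)
]

def server_round(clients):
    # Average only the bodies and broadcast the result back.
    avg_body = np.mean([c["body"] for c in clients], axis=0)
    for c in clients:
        c["body"] = avg_body.copy()
    return clients

clients = server_round(clients)
print("bodies identical after round:",
      all(np.allclose(clients[0]["body"], c["body"]) for c in clients))
```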
Author:
Grosse, Julia, Wu, Ruotian, Rashid, Ahmad, Hennig, Philipp, Poupart, Pascal, Kristiadi, Agustinus
Tree search algorithms such as greedy and beam search are the standard when it comes to finding sequences of maximum likelihood in the decoding processes of large language models (LLMs). However, they are myopic since they do not take the complete ro…
External link:
http://arxiv.org/abs/2407.03951
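As background for this snippet, the sketch below implements plain beam search over a hypothetical next-token log-probability function standing in for an LLM. It shows the standard maximum-likelihood decoding the abstract calls myopic, not the paper's proposed alternative.

```python
import math

def next_token_logprobs(prefix):
    # Hypothetical stand-in for an LLM: slightly prefers "a", forces the
    # end-of-sequence token once the prefix reaches four tokens.
    if len(prefix) >= 4:
        return {"</s>": 0.0, "a": -10.0, "b": -10.0}
    return {"a": math.log(0.5), "b": math.log(0.3), "</s>": math.log(0.2)}

def beam_search(beam_width=2, max_len=6):
    beams = [([], 0.0)]                      # (token sequence, total log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "</s>":    # finished sequences carry over
                candidates.append((seq, score))
                continue
            for tok, lp in next_token_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # Keep only the top-k partial sequences by cumulative log-probability;
        # this greedy pruning is what makes the search myopic.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq and seq[-1] == "</s>" for seq, _ in beams):
            break
    return beams

for seq, score in beam_search():
    print(" ".join(seq), f"(log-prob {score:.2f})")
```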
Author:
Subramanian, Sriram Ganapathi, Liu, Guiliang, Elmahgiubi, Mohammed, Rezaee, Kasra, Poupart, Pascal
In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal …
External link:
http://arxiv.org/abs/2406.16782
Large language models (LLMs) can be significantly improved by aligning them to human preferences -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. Due to their ability …
External link:
http://arxiv.org/abs/2406.07780
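For background on the RLHF pipeline this snippet refers to: the reward-modelling step is commonly trained with a Bradley-Terry pairwise preference loss, -log sigmoid(r(chosen) - r(rejected)). The sketch below uses a hypothetical linear reward model on synthetic response embeddings; it is background only, not the linked paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
w = rng.normal(size=d) * 0.01            # toy reward-model parameters

def reward(x, w):
    return x @ w                          # scalar score for a response embedding

def bt_loss_and_grad(x_chosen, x_rejected, w):
    # Bradley-Terry pairwise loss: push the preferred response's score
    # above the rejected one's.
    margin = reward(x_chosen, w) - reward(x_rejected, w)
    p = 1.0 / (1.0 + np.exp(-margin))     # P(chosen preferred over rejected)
    loss = -np.log(p)
    grad = -(1.0 - p) * (x_chosen - x_rejected)
    return loss, grad

# One gradient step on a synthetic preference pair.
x_chosen, x_rejected = rng.normal(size=d), rng.normal(size=d)
loss, grad = bt_loss_and_grad(x_chosen, x_rejected, w)
w -= 0.1 * grad
print(f"pairwise preference loss: {loss:.3f}")
```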
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Subramanian, Sriram Ganapathi, Fortuin, Vincent, Poupart, Pascal, Pleiss, Geoff
Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human feedback …
External link:
http://arxiv.org/abs/2406.06459
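To make the vanilla Bayesian-optimization loop behind this snippet concrete, here is a minimal Gaussian-process surrogate with an expected-improvement acquisition on a 1-D toy objective. The objective, kernel length-scale, and candidate grid are all hypothetical choices, and how the paper actually incorporates human feedback is not shown.

```python
import numpy as np
from scipy.stats import norm

def objective(x):                         # hypothetical black-box function
    return -np.sin(3 * x) - x ** 2 + 0.7 * x

def rbf(a, b, ls=0.3):                    # squared-exponential kernel
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # Standard GP regression equations for posterior mean and std dev.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xq), rbf(Xq, Xq)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.clip(np.diag(Kss - Ks.T @ Kinv @ Ks), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 2, size=3)            # initial design
y = objective(X)
Xq = np.linspace(-1, 2, 200)              # candidate grid

for _ in range(10):                       # BO iterations
    mu, sigma = gp_posterior(X, y, Xq)
    x_next = Xq[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

print(f"best x: {X[np.argmax(y)]:.3f}, best value: {y.max():.3f}")
```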
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two main facts: …
External link:
http://arxiv.org/abs/2403.11062
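For background on the quantity this snippet optimizes: CVaR at level alpha can be estimated from sampled returns as the mean of the worst alpha-fraction, and the classic REINFORCE-style CVaR gradient weights only those tail trajectories, which is one source of the sample inefficiency mentioned. The one-parameter Gaussian policy and synthetic returns below are toys, not the linked paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1                                   # risk level: worst 10% of returns
theta = 0.0                                   # policy parameter (mean action)

def rollout(theta, n):
    # Toy one-step "trajectories": a unit-variance Gaussian policy over
    # actions, with a noisy quadratic return peaked at action = 1.
    actions = rng.normal(loc=theta, scale=1.0, size=n)
    returns = -(actions - 1.0) ** 2 + rng.normal(scale=0.1, size=n)
    score = actions - theta                   # d/dtheta log N(a; theta, 1)
    return returns, score

returns, score = rollout(theta, n=5000)
var_alpha = np.quantile(returns, alpha)       # Value-at-Risk threshold
tail = returns <= var_alpha                   # worst alpha-fraction of samples

cvar = returns[tail].mean()                   # sample estimate of CVaR_alpha
# REINFORCE-style CVaR gradient: only tail samples contribute, so most of
# the 5000 rollouts carry zero weight in this update.
grad = ((returns[tail] - var_alpha) * score[tail]).sum() / (alpha * len(returns))

print(f"CVaR_{alpha:.2f} estimate: {cvar:.3f}, gradient estimate: {grad:.3f}")
```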
Author:
Schulte, Oliver, Poupart, Pascal
Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select interventions that maximize …
External link:
http://arxiv.org/abs/2403.04221
Author:
Kristiadi, Agustinus, Strieth-Kalthoff, Felix, Skreta, Marta, Poupart, Pascal, Aspuru-Guzik, Alán, Pleiss, Geoff
Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge into efficient exploration of a large molecular space.
External link:
http://arxiv.org/abs/2402.05015