Showing 1 - 10 of 37 for search: '"Piché, Alexandre"'
In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-r…
External link:
http://arxiv.org/abs/2405.13022
Author:
Long, Stephanie, Piché, Alexandre, Zantedeschi, Valentina, Schuster, Tibor, Drouin, Alexandre
Understanding the causal relationships that underlie a system is a fundamental prerequisite to accurate decision-making. In this work, we explore how expert knowledge can be used to improve the data-driven identification of causal graphs, beyond Mark…
External link:
http://arxiv.org/abs/2307.02390
Building causal graphs can be a laborious process. To ensure all relevant causal pathways have been captured, researchers often have to discuss with clinicians and experts while also reviewing extensive relevant medical literature. By encoding common…
External link:
http://arxiv.org/abs/2303.05279
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle, which is expensive to compute since it invo…
External link:
http://arxiv.org/abs/2211.10747
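The MBO setup described in this snippet can be sketched in a few lines: rather than querying the expensive oracle at search time, a cheap surrogate model is fit to logged (design, score) pairs and used to rank new candidates. This is an illustrative assumption about the general MBO recipe, not the paper's specific method; the quadratic `oracle` and the polynomial surrogate are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def oracle(x):
    # Hypothetical expensive ground-truth reward, peaked at x = 2.
    return -(x - 2.0) ** 2

# Offline dataset of past designs and their oracle scores.
xs = rng.uniform(-5, 5, size=50)
ys = oracle(xs)

# Cheap surrogate: fit a quadratic to the logged data.
surrogate = np.poly1d(np.polyfit(xs, ys, deg=2))

# Propose candidates and keep the one the surrogate scores highest,
# without ever calling the oracle during the search.
candidates = np.linspace(-5, 5, 1001)
best = candidates[np.argmax(surrogate(candidates))]
```

Because the surrogate is only trained on logged data, designs far from the dataset can be scored over-optimistically, which is the core difficulty MBO methods try to address.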
Author:
Piche, Alexandre, Thomas, Valentin, Marino, Joseph, Pardinas, Rafael, Marconi, Gian Maria, Pal, Christopher, Khan, Mohammad Emtiyaz
Bootstrapping is behind much of the successes of Deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values. Target Networks are employed to stabilize traini…
External link:
http://arxiv.org/abs/2210.12282
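The instability this snippet refers to comes from bootstrapped regression targets that move every time the value function is updated. A minimal tabular sketch of the standard target-network remedy (the baseline the paper builds on, not its proposed method) looks like this; the environment dynamics and reward here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, tau = 5, 2, 0.9, 0.05

q_online = np.zeros((n_states, n_actions))
q_target = q_online.copy()  # slow-moving copy used for bootstrapping

def td_update(s, a, r, s_next, lr=0.1):
    # Bootstrap from the *target* network, so the regression target
    # does not change with every online update.
    target = r + gamma * q_target[s_next].max()
    q_online[s, a] += lr * (target - q_online[s, a])

for _ in range(1000):
    s, a = rng.integers(n_states), rng.integers(n_actions)
    s_next = rng.integers(n_states)      # toy random transitions
    r = float(s_next == 0)               # toy reward: reaching state 0
    td_update(s, a, r, s_next)
    # Polyak (soft) update: slowly track the online values.
    q_target = (1 - tau) * q_target + tau * q_online
```

Setting `tau = 1.0` recovers a hard periodic copy every step; small `tau` trades target freshness for stability.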
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels. It is as simple as supervised learning and Behavior Cloning (BC), bu…
External link:
http://arxiv.org/abs/2210.12272
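The Behavior Cloning baseline mentioned in this snippet reduces offline RL to plain supervised learning: regress the logged actions on the logged states. A minimal sketch under simple assumptions (a linear demonstrator policy and a least-squares fit, both hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline dataset: states logged by some behavior policy,
# with the actions that policy took (here, a linear demonstrator).
states = rng.normal(size=(200, 4))
true_w = np.array([1.0, -2.0, 0.5, 0.0])
actions = states @ true_w

# Behavior Cloning: fit a policy a = s @ w by least squares,
# with no reward signal and no environment interaction.
w, *_ = np.linalg.lstsq(states, actions, rcond=None)

def policy(s):
    return s @ w
```

Because BC imitates all logged actions equally, it inherits the mistakes of low-expertise policies in the dataset; weighting or filtering by return is the usual refinement.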
Author:
Rajeswar, Sai, Mazzaglia, Pietro, Verbelen, Tim, Piché, Alexandre, Dhoedt, Bart, Courville, Aaron, Lacoste, Alexandre
Controlling artificial agents from visual sensory data is an arduous task. Reinforcement learning (RL) algorithms can succeed but require large amounts of interactions between the agent and the environment. To alleviate the issue, unsupervised RL pro…
External link:
http://arxiv.org/abs/2209.12016
Author:
Piché, Alexandre, Thomas, Valentin, Pardinas, Rafael, Marino, Joseph, Marconi, Gian Maria, Pal, Christopher, Khan, Mohammad Emtiyaz
Bootstrapping is behind much of the successes of deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values. Target Networks are employed to stabilize traini…
External link:
http://arxiv.org/abs/2106.02613
Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control, enabling the estimation and sampling of high-value actions. From the variational inference perspective on RL, policy networks, when used with…
External link:
http://arxiv.org/abs/2010.10670