Showing 1 - 10 of 964 for search: '"Khetarpal P"'
Author:
Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, Martens, James, van Hasselt, Hado, Pascanu, Razvan, Dabney, Will
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting overestim
External link:
http://arxiv.org/abs/2407.01800
Author:
Khetarpal, Khimya, Guo, Zhaohan Daniel, Pires, Bernardo Avila, Tang, Yunhao, Lyle, Clare, Rowland, Mark, Heess, Nicolas, Borsa, Diana, Guez, Arthur, Dabney, Will
Learning a good representation is a crucial challenge for Reinforcement Learning (RL) agents. Self-predictive learning provides means to jointly learn a latent representation and dynamics model by bootstrapping from future latent representations (BYO
External link:
http://arxiv.org/abs/2406.02035
Author:
Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, van Hasselt, Hado, Pascanu, Razvan, Martens, James, Dabney, Will
Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution. In settings where this assumption is
External link:
http://arxiv.org/abs/2402.18762
Author:
Sharma, Sugandha, Davidson, Guy, Khetarpal, Khimya, Kanervisto, Anssi, Arora, Udit, Hofmann, Katja, Momennejad, Ida
Achieving human-AI alignment in complex multi-agent games is crucial for creating trustworthy AI agents that enhance gameplay. We propose a method to evaluate this alignment using an interpretable task-sets framework, focusing on high-level behaviora
External link:
http://arxiv.org/abs/2402.03575
The ability to plan at many different levels of abstraction enables agents to envision the long-term repercussions of their decisions and thus enables sample-efficient learning. This becomes particularly beneficial in complex environments from high-d
External link:
http://arxiv.org/abs/2310.09997
Author:
Nguyen, Tuan Dung, Ting, Yuan-Sen, Ciucă, Ioana, O'Neill, Charlie, Sun, Ze-Chang, Jabłońska, Maja, Kruk, Sandor, Perkowski, Ernest, Miller, Jack, Li, Jason, Peek, Josh, Iyer, Kartheik, Różański, Tomasz, Khetarpal, Pranav, Zaman, Sharaf, Brodrick, David, Méndez, Sergio J. Rodríguez, Bui, Thang, Goodman, Alyssa, Accomazzi, Alberto, Naiman, Jill, Cranney, Jesse, Schawinski, Kevin, UniverseTBD
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astr
External link:
http://arxiv.org/abs/2309.06126
Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an object-centri
External link:
http://arxiv.org/abs/2304.13892
We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task. The agent can use its experience in each t
External link:
http://arxiv.org/abs/2212.14530
Author:
Ranjana Khetarpal, Veena Chatrath, Suparna Grover, Puneetpal Kaur, Ankita Taneja, Aishwarya Madaan
Published in:
Journal of Obstetric Anaesthesia and Critical Care, Vol 14, Iss 1, Pp 45-53 (2024)
Background: Lumbar epidural analgesia is a safe, effective, and beneficial technique for both the parturient and the fetus. Dural puncture epidural has been claimed to be better than combined spinal epidural and epidural techniques by many. We undert
External link:
https://doaj.org/article/1dbea4068d21408589739ef980ab9a99
Published in:
IET Generation, Transmission & Distribution, Vol 18, Iss 1, Pp 50-62 (2024)
Abstract: This paper proposes a recurrent neural network-based model to segment and classify multiple combined power quality disturbances (PQDs) from the PQD voltage signal. A modified bi‐directional long short‐term memory (BI‐LSTM) mod
External link:
https://doaj.org/article/82581bfa1d554afbbf922e7908dde49f