Showing 1 - 10 of 31 for search: '"Karimpanal, Thommen George"'
Deep reinforcement learning (RL) policies, although optimal in terms of task rewards, may not align with the personal preferences of human users. To ensure this alignment, a naive solution would be to retrain the agent using a reward function that…
External link:
http://arxiv.org/abs/2409.20016
Author:
Karimpanal, Thommen George, Semage, Laknath Buddhika, Rana, Santu, Le, Hung, Tran, Truyen, Gupta, Sunil, Venkatesh, Svetha
Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text. This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to…
External link:
http://arxiv.org/abs/2308.13542
Published in:
AAMAS 2023
Autonomously learning diverse behaviors without an extrinsic reward signal has been a problem of interest in reinforcement learning. However, the nature of learning in such mechanisms is unconstrained, often resulting in the accumulation of several…
External link:
http://arxiv.org/abs/2303.04592
Simulation-based learning often provides a cost-efficient recourse to reinforcement learning applications in robotics. However, simulators are generally incapable of accurately replicating real-world dynamics, and thus bridging the sim2real gap is…
External link:
http://arxiv.org/abs/2302.04013
Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real-world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground… (a minimal sketch of parameter grounding follows this entry).
External link:
http://arxiv.org/abs/2202.05844
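For context on the grounding step mentioned above: one generic way to estimate real-world parameters is to search for simulator settings whose rollouts best match an observed real trajectory. The sketch below is a minimal illustration under toy assumptions (a hypothetical one-parameter friction model and a grid search); it is not the method of the linked paper.

```python
import numpy as np

def simulate(friction, steps=50, v0=5.0, dt=0.1):
    """Toy dynamics: a sliding block decelerated by friction."""
    v, x, xs = v0, 0.0, []
    for _ in range(steps):
        v = max(v - friction * dt, 0.0)
        x += v * dt
        xs.append(x)
    return np.array(xs)

# "Real" trajectory, synthesized here with an unknown friction of 0.8.
real_traj = simulate(friction=0.8)

# Score candidate simulator parameters by trajectory mismatch and keep
# the best; richer system-identification tools (e.g. Bayesian
# optimization) would replace this grid search in practice.
candidates = np.linspace(0.1, 2.0, 100)
errors = [np.mean((simulate(f) - real_traj) ** 2) for f in candidates]
grounded_friction = candidates[int(np.argmin(errors))]

print(f"estimated friction: {grounded_friction:.2f}")  # close to 0.8
```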
Adapting an agent's behaviour to new environments has been one of the primary focus areas of physics-based reinforcement learning. Although recent approaches such as universal policy networks partially address this issue by enabling the storage of multiple… (a minimal architectural sketch follows this entry).
External link:
http://arxiv.org/abs/2202.05843
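A universal policy network, as referenced in this entry, conditions a single policy on environment parameters so that one set of weights covers a family of dynamics. The sketch below is a hypothetical, untrained architecture; the dimensions and names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class UniversalPolicy(nn.Module):
    """Policy conditioned on latent environment parameters (e.g. mass,
    friction), so a single network stores a family of policies."""
    def __init__(self, state_dim=8, env_param_dim=2, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + env_param_dim, 64),
            nn.Tanh(),
            nn.Linear(64, action_dim),
        )

    def forward(self, state, env_params):
        # Concatenating the environment parameters lets the same weights
        # produce different behaviour under different dynamics.
        return self.net(torch.cat([state, env_params], dim=-1))

policy = UniversalPolicy()
state = torch.randn(1, 8)
env_params = torch.tensor([[1.0, 0.5]])  # hypothetical mass, friction
action = policy(state, env_params)
print(action.shape)  # torch.Size([1, 2])
```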
Author:
Karimpanal, Thommen George, Le, Hung, Abdolshah, Majid, Rana, Santu, Gupta, Sunil, Tran, Truyen, Venkatesh, Svetha
The optimistic nature of the Q-learning target leads to an overestimation bias, which is an inherent problem associated with standard $Q$-learning. Such a bias fails to account for the possibility of low returns, particularly in risky scenarios… (a numerical illustration follows this entry).
External link:
http://arxiv.org/abs/2111.02787
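The overestimation bias mentioned above comes from backing up a max over noisy value estimates: since $\mathbb{E}[\max_a \hat{Q}(s',a)] \ge \max_a \mathbb{E}[\hat{Q}(s',a)]$, zero-mean estimation noise inflates the Q-learning target. A small numerical illustration (all values hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# True action values at the next state: every action is equally good.
true_q = np.zeros(5)

# Q-learning backs up max_a Q_hat(s', a); with zero-mean estimation
# noise the max is biased upward even though the true max is 0.
noisy_estimates = true_q + rng.normal(0.0, 1.0, size=(10_000, 5))
q_learning_target = noisy_estimates.max(axis=1).mean()

print(f"true max value: {true_q.max():.2f}")               # 0.00
print(f"mean Q-learning target: {q_learning_target:.2f}")  # ~1.16 > 0
```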
Physics-based reinforcement learning tasks can benefit from simplified physics simulators as they potentially allow near-optimal policies to be learned in simulation. However, such simulators require the latent factors (e.g. mass, friction coefficients)…
External link:
http://arxiv.org/abs/2104.08795
Author:
Karimpanal, Thommen George
The recent successes of deep learning and deep reinforcement learning have firmly established their statuses as state-of-the-art artificial learning techniques. However, longstanding drawbacks of these approaches, such as their poor sample efficiencies…
External link:
http://arxiv.org/abs/2002.01088
Prior access to domain knowledge could significantly improve the performance of a reinforcement learning agent. In particular, it could help agents avoid potentially catastrophic exploratory actions, which would otherwise have to be experienced during…
External link:
http://arxiv.org/abs/1909.04307