Showing 1 - 10 of 1,134 for search: '"Bhambri A"'
Author:
Gundawar, Atharva; Verma, Mudit; Guan, Lin; Valmeekam, Karthik; Bhambri, Siddhant; Kambhampati, Subbarao
As the applicability of Large Language Models (LLMs) extends beyond traditional text processing tasks, there is a burgeoning interest in their potential to excel in planning and reasoning assignments, realms traditionally reserved for System 2 cognition…
External link:
http://arxiv.org/abs/2405.20625
Author:
Bhambri, Siddhant; Bhattacharjee, Amrita; Kalwar, Durgesh; Guan, Lin; Liu, Huan; Kambhampati, Subbarao
Reinforcement Learning (RL) suffers from sample inefficiency in sparse reward domains, and the problem is further pronounced in the case of stochastic transitions. To improve sample efficiency, reward shaping is a well-studied approach to introduce…
External link:
http://arxiv.org/abs/2405.15194
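The entry above concerns reward shaping for sample-efficient RL in sparse-reward domains. As generic background rather than the method of this paper, here is a minimal sketch of classic potential-based reward shaping (Ng et al., 1999), which adds a shaping term that provably preserves the optimal policy; the potential function `phi` below is a hypothetical example a domain designer would supply.

```python
# Potential-based reward shaping: add F(s, s') = gamma * phi(s') - phi(s)
# to the environment reward. Generic illustration, not the specific
# shaping scheme of the paper linked above.

GAMMA = 0.99

def phi(state):
    # Hypothetical potential: negative Manhattan distance to a goal cell.
    goal = (9, 9)
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))

def shaped_reward(state, next_state, env_reward):
    """Environment reward plus the shaping term F(s, s')."""
    return env_reward + GAMMA * phi(next_state) - phi(state)

# A sparse-reward step that earns 0 from the environment still yields a
# learning signal when the agent moves toward the goal.
print(shaped_reward((0, 0), (0, 1), 0.0))  # positive: progress toward (9, 9)
```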
The reasoning abilities of Large Language Models (LLMs) remain a topic of debate. Some methods, such as ReAct-based prompting, have gained popularity for claiming to enhance the sequential decision-making abilities of agentic LLMs. However, it is unclear…
External link:
http://arxiv.org/abs/2405.13966
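ReAct-style prompting, mentioned in the snippet above, interleaves free-form "thoughts" with environment actions. The sketch below shows the shape of such an agent loop; the `llm()` completion function and the Gym-like text environment are assumptions for illustration, not APIs from the paper.

```python
# Minimal ReAct-style agent loop (after Yao et al., 2022), sketched
# against a hypothetical llm() completion function and a text-based
# environment. Both are assumptions, not artifacts of the paper above.

def react_episode(env, llm, max_steps=10):
    observation = env.reset()
    transcript = f"Observation: {observation}\n"
    for _ in range(max_steps):
        # Ask the model for a thought followed by an action, ReAct-style.
        completion = llm(transcript + "Thought:")
        thought, _, action = completion.partition("Action:")
        transcript += f"Thought:{thought}Action: {action.strip()}\n"
        observation, reward, done = env.step(action.strip())
        transcript += f"Observation: {observation}\n"
        if done:
            return reward, transcript
    return 0.0, transcript
```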
Author:
Kambhampati, Subbarao; Valmeekam, Karthik; Guan, Lin; Verma, Mudit; Stechly, Kaya; Bhambri, Siddhant; Saldyt, Lucas; Murthy, Anil
Published in:
Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024
There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies.
External link:
http://arxiv.org/abs/2402.01817
Large Language Models have shown exceptional generative abilities in various natural language and generation tasks. However, possible anthropomorphization and leniency towards failure cases have propelled discussions on emergent abilities of Large Language Models…
External link:
http://arxiv.org/abs/2401.05302
Preference-based Reinforcement Learning (PbRL) has made significant strides in single-agent settings, but has not been studied for multi-agent frameworks. On the other hand, modeling cooperation between multiple agents, specifically Human-AI Teaming…
External link:
http://arxiv.org/abs/2312.14292
Author:
Sanyal, Sunandini; Asokan, Ashish Ramayee; Bhambri, Suvaansh; Kulkarni, Akshay; Kundu, Jogendra Nath; Babu, R. Venkatesh
Conventional Domain Adaptation (DA) methods aim to learn domain-invariant feature representations to improve the target adaptation performance. However, we motivate that domain-specificity is equally important, since in-domain trained models hold crucial…
External link:
http://arxiv.org/abs/2308.14023
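For context on the "domain-invariant" baseline this entry argues is insufficient on its own, here is the standard gradient-reversal trick from domain-adversarial training (DANN, Ganin & Lempitsky, 2015); this is background on the conventional approach, not the paper's proposed method.

```python
# Gradient reversal layer, the core trick behind domain-adversarial
# training (DANN). The feature extractor is pushed toward features the
# domain classifier cannot separate, i.e. domain-invariant features.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the encoder.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Hypothetical usage with an encoder and a domain classifier:
#   features = encoder(images)
#   domain_logits = domain_classifier(grad_reverse(features))
#   domain_loss = torch.nn.functional.cross_entropy(domain_logits, domain_labels)
```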
Robotic agents performing domestic chores by natural language directives are required to master the complex job of navigating the environment and interacting with the objects in it. The tasks given to the agents are often composite, and thus challenging…
External link:
http://arxiv.org/abs/2308.09387
Preference-based Reinforcement Learning has shown much promise for utilizing human binary feedback on queried trajectory pairs to recover the underlying reward model of the Human in the Loop (HiL). While works have attempted to better utilize the queries…
External link:
http://arxiv.org/abs/2302.08738
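The standard PbRL recipe fits a reward model to binary trajectory preferences with a Bradley-Terry likelihood (e.g., Christiano et al., 2017). A minimal sketch of that objective follows; it illustrates the generic machinery, not this paper's query-utilization contribution.

```python
# Bradley-Terry reward learning from binary trajectory preferences,
# the standard PbRL objective. Generic sketch, not the specific method
# of the paper linked above.
import torch

def preference_loss(reward_model, traj_a, traj_b, pref):
    """pref = 1.0 if the human preferred traj_a, else 0.0.

    Each trajectory is a tensor of shape (T, state_dim); the predicted
    return is the sum of per-step rewards from the learned model.
    """
    ret_a = reward_model(traj_a).sum()
    ret_b = reward_model(traj_b).sum()
    # Under Bradley-Terry, P(a preferred over b) = sigmoid(R(a) - R(b)),
    # so the preference label is fit with a logistic loss on the gap.
    logit = ret_a - ret_b
    return torch.nn.functional.binary_cross_entropy_with_logits(
        logit, torch.tensor(pref))
```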
In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems…
External link:
http://arxiv.org/abs/2211.10298
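Wordle fits the POMDP framing because the hidden word is never observed directly: the belief state is the set of candidate words consistent with all guess/feedback pairs so far, and each guess shrinks it. Below is a minimal belief-update sketch (with simplified handling of repeated letters), not the rollout-based algorithm of the paper.

```python
# Wordle belief update: keep only the secret words consistent with the
# observed feedback. Generic illustration of the POMDP belief state.

def feedback(guess, secret):
    """Simplified Wordle colors per letter: 2=green, 1=yellow, 0=gray.

    Note: this simplification does not handle repeated letters exactly
    as the real game does.
    """
    return tuple(
        2 if g == s else (1 if g in secret else 0)
        for g, s in zip(guess, secret)
    )

def update_belief(candidates, guess, observed):
    """Keep only words that would have produced the observed feedback."""
    return [w for w in candidates if feedback(guess, w) == observed]

words = ["crane", "crate", "trace", "brake"]
obs = feedback("crane", "crate")           # pretend the secret is "crate"
print(update_belief(words, "crane", obs))  # ['crate']
```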