Showing 1 - 10 of 38
for search: '"Rajendran, Janarthanan"'
Author:
Bouchoucha, Rached, Yahmed, Ahmed Haj, Patil, Darshan, Rajendran, Janarthanan, Nikanjam, Amin, Chandar, Sarath, Khomh, Foutse
Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However, like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges…
External link:
http://arxiv.org/abs/2410.04322
Author:
Rosati, Domenic, Edkins, Giles, Raj, Harsh, Atanasov, David, Majumdar, Subhabrata, Rajendran, Janarthanan, Rudzicz, Frank, Sajjad, Hassan
While there has been progress towards aligning Large Language Models (LLMs) with human values and ensuring safe behaviour at inference time, safety-aligned LLMs are known to be vulnerable to training-time attacks such as supervised fine-tuning (SFT)…
External link:
http://arxiv.org/abs/2409.12914
Published in:
Transactions of the Association for Computational Linguistics, Vol. 7, pp. 375-386 (2019)
Neural end-to-end goal-oriented dialog systems showed promise to reduce the workload of human agents in customer service, as well as to reduce wait time for users. However, their inability to handle new user behavior at deployment has limited their…
External link:
https://doaj.org/article/06ee39ed9d7443a3b56f2cbda7b1279c
In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable. The resetting assumption limits the potential of reinforcement learning in the real world, as providing resets to an agent… (see the sketch below)
External link:
http://arxiv.org/abs/2405.01684
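A minimal sketch of the contrast between episodic and reset-free training, using the gymnasium API (the agent object and its act/observe methods are hypothetical placeholders, and CartPole stands in for any environment):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")

# Standard episodic training: the simulator grants a reset after every
# episode, teleporting the agent back to a start state.
def episodic_training(agent, num_episodes=100):
    for _ in range(num_episodes):
        obs, _ = env.reset()  # the strong resetting assumption
        done = False
        while not done:
            action = agent.act(obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            agent.observe(obs, reward)
            done = terminated or truncated

# Reset-free (single-lifetime) training: one initial reset only; afterwards
# the agent must return to useful states on its own.
def reset_free_training(agent, num_steps=100_000):
    obs, _ = env.reset()  # the only reset the agent ever receives
    for _ in range(num_steps):
        action = agent.act(obs)
        obs, reward, terminated, truncated, _ = env.step(action)
        agent.observe(obs, reward)
        # no env.reset() here: recovering from bad states is part of
        # the agent's learning problem
```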
The use of dynamic pricing by profit-maximizing firms gives rise to demand fairness concerns, measured by discrepancies in consumer groups' demand responses to a given pricing strategy. Notably, dynamic pricing may result in buyer distributions… (see the sketch below)
External link:
http://arxiv.org/abs/2404.14620
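One plausible way to quantify the demand-fairness discrepancy described above (the linear demand curves and group parameters are invented for illustration, not the paper's model):

```python
import numpy as np

# Hypothetical linear demand curve: demand(p) = max(0, a - b * p)
def demand(prices, a, b):
    return np.maximum(0.0, a - b * prices)

def demand_fairness_gap(prices, groups):
    """Worst-case discrepancy in per-group demand responses to a price path."""
    responses = np.stack([demand(prices, a, b) for (a, b) in groups])
    # spread across groups at each posted price, then the worst case over prices
    return float(np.max(responses.max(axis=0) - responses.min(axis=0)))

# A dynamic pricing strategy expressed as a sequence of posted prices
prices = np.array([1.0, 1.5, 2.0, 2.5])
groups = [(10.0, 2.0),   # group A: high base demand, price-sensitive
          (8.0, 1.0)]    # group B: lower base demand, less sensitive
print(demand_fairness_gap(prices, groups))  # 1.0 for these toy numbers
```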
Current model-based reinforcement learning (MBRL) agents struggle with long-term dependencies. This limits their ability to effectively solve tasks involving extended time gaps between actions and outcomes, or tasks demanding the recall of distant…
External link:
http://arxiv.org/abs/2403.04253
Author:
Sudhakar, Arjun Vaithilingam, Parthasarathi, Prasanna, Rajendran, Janarthanan, Chandar, Sarath
Large Language Models (LLMs) have demonstrated superior performance on language understanding benchmarks. CALM, a popular approach, leverages the linguistic priors of LLMs -- GPT-2 -- for action candidate recommendations to improve performance in text-based games… (see the sketch below)
External link:
http://arxiv.org/abs/2311.07687
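A rough sketch of the action-candidate pattern described above, using GPT-2 through Hugging Face transformers (the prompt format, candidate count, and downstream scoring are assumptions, not CALM's actual setup):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def propose_actions(observation, k=5, max_new_tokens=8):
    """Sample k candidate action strings conditioned on the game state."""
    prompt = f"{observation}\nNext action:"  # assumed prompt format
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        num_return_sequences=k,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    prompt_len = inputs["input_ids"].shape[1]
    return [tokenizer.decode(out[prompt_len:], skip_special_tokens=True).strip()
            for out in outputs]

# A downstream RL agent would then score these candidates and pick one.
print(propose_actions("You are in a dark room. There is a door to the north."))
```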
Cooperative Multi-Agent Reinforcement Learning (MARL) algorithms with Zero-Shot Coordination (ZSC) have gained significant attention in recent years. ZSC refers to the ability of agents to coordinate zero-shot (without additional interaction experience)… (see the cross-play sketch below)
External link:
http://arxiv.org/abs/2308.10284
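ZSC is commonly measured by cross-play: pairing policies from independent training runs that never interacted during training. A minimal sketch (the evaluate_pair rollout function is a hypothetical placeholder):

```python
import itertools
import numpy as np

def cross_play_matrix(policies, evaluate_pair):
    """Score every pairing of independently trained policies."""
    n = len(policies)
    scores = np.zeros((n, n))
    for i, j in itertools.product(range(n), repeat=2):
        scores[i, j] = evaluate_pair(policies[i], policies[j])
    return scores

def zsc_score(scores):
    """Mean return over cross pairings only, excluding the self-play diagonal."""
    n = scores.shape[0]
    return float(scores[~np.eye(n, dtype=bool)].mean())

# evaluate_pair(pi_a, pi_b) would roll the two policies out together in a
# cooperative environment (e.g., Hanabi) and return the mean episode return;
# high off-diagonal scores indicate good zero-shot coordination.
```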
Efficient exploration is critical in cooperative deep Multi-Agent Reinforcement Learning (MARL). In this work, we propose an exploration method that effectively encourages cooperative exploration based on the idea of a sequential action-computation scheme… (see the sketch below)
External link:
http://arxiv.org/abs/2303.09032
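A toy sketch of sequential action computation, where each agent conditions on the actions already chosen by its predecessors instead of acting simultaneously (the per-agent policy interface is an assumption, not the paper's algorithm):

```python
import random

def sequential_joint_action(policies, obs, epsilon=0.1):
    """Compute a joint action one agent at a time.

    Each agent sees the shared observation plus the actions already
    committed by earlier agents, so later agents can explore in a way
    that complements, rather than duplicates, earlier choices.
    """
    joint_action = []
    for policy in policies:
        if random.random() < epsilon:
            action = policy.random_action()  # exploratory choice
        else:
            action = policy.best_action(obs, tuple(joint_action))
        joint_action.append(action)
    return joint_action

# The usual independent scheme computes every agent's action from obs
# alone, which can lead agents to explore redundantly.
```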
Author:
Rahimi-Kalahroudi, Ali, Rajendran, Janarthanan, Momennejad, Ida, van Seijen, Harm, Chandar, Sarath
One of the key behavioral characteristics used in neuroscience to determine whether the subject of study -- be it a rodent or a human -- exhibits model-based learning is effective adaptation to local changes in the environment, a particular form of…
External link:
http://arxiv.org/abs/2303.08690