Showing 1 - 10 of 1,709 results for search: '"da Silva, Bruno"'
Disinformation on social media poses both societal and technical challenges. While previous studies have integrated textual information into propagation networks, they have yet to fully leverage the advancements in Transformer-based language models f
External link:
http://arxiv.org/abs/2410.19193
Evaluating policies using off-policy data is crucial for applying reinforcement learning to real-world problems such as healthcare and autonomous driving. Previous methods for off-policy evaluation (OPE) generally suffer from high variance or irreduc
External link:
http://arxiv.org/abs/2410.02172
Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous cal
External link:
http://arxiv.org/abs/2406.16241
Author:
Chaudhari, Shreyas, Aggarwal, Pranjal, Murahari, Vishvak, Rajpurohit, Tanmay, Kalyan, Ashwin, Narasimhan, Karthik, Deshpande, Ameet, da Silva, Bruno Castro
State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from hu
External link:
http://arxiv.org/abs/2404.08555
Author:
Gupta, Dhawal, Jordan, Scott M., Chaudhari, Shreyas, Liu, Bo, Thomas, Philip S., da Silva, Bruno Castro
In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment
External link:
http://arxiv.org/abs/2312.12972
Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward specific behaviors is a complex task. This is challenging since it requires the identification of reward structures that are not sparse and that avoid inadve
External link:
http://arxiv.org/abs/2310.19007
Author:
Melotti, Gledson, Bastos, Johann J. S., da Silva, Bruno L. S., Zanotelli, Tiago, Premebida, Cristiano
Object recognition is a crucial step in perception systems for autonomous and intelligent vehicles, as evidenced by the numerous research works on the topic. In this paper, object recognition is explored by using multisensory and multimodality approa
External link:
http://arxiv.org/abs/2310.05951
Author:
Kostas, James E., Jordan, Scott M., Chandak, Yash, Theocharous, Georgios, Gupta, Dhawal, White, Martha, da Silva, Bruno Castro, Thomas, Philip S.
Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpr
External link:
http://arxiv.org/abs/2305.09838
Author:
Chandak, Yash, Shankar, Shiv, Bastian, Nathaniel D., da Silva, Bruno Castro, Brunskill, Emma, Thomas, Philip S.
Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to exte
External link:
http://arxiv.org/abs/2301.10330