Search results - "da Silva, Bruno"

Report

Enriching GNNs with Text Contextual Representations for Detecting Disinformation Campaigns on Social Media

Authors: da Silva, Bruno Croso Cunha; Ferraz, Thomas Palmeira; Lopes, Roseli De Deus

Disinformation on social media poses both societal and technical challenges. While previous studies have integrated textual information into propagation networks, they have yet to fully leverage the advancements in Transformer-based language models f

External link: http://arxiv.org/abs/2410.19193

Report

Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation

Authors: Chaudhari, Shreyas; Deshpande, Ameet; da Silva, Bruno Castro; Thomas, Philip S.

Evaluating policies using off-policy data is crucial for applying reinforcement learning to real-world problems such as healthcare and autonomous driving. Previous methods for off-policy evaluation (OPE) generally suffer from high variance or irreduc

External link: http://arxiv.org/abs/2410.02172

Report

Position: Benchmarking is Limited in Reinforcement Learning Research

Authors: Jordan, Scott M.; White, Adam; da Silva, Bruno Castro; White, Martha; Thomas, Philip S.

Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous cal

External link: http://arxiv.org/abs/2406.16241

Report

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

Authors: Chaudhari, Shreyas; Aggarwal, Pranjal; Murahari, Vishvak; Rajpurohit, Tanmay; Kalyan, Ashwin; Narasimhan, Karthik; Deshpande, Ameet; da Silva, Bruno Castro

State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from hu

External link: http://arxiv.org/abs/2404.08555

Report

From Past to Future: Rethinking Eligibility Traces

Authors: Gupta, Dhawal; Jordan, Scott M.; Chaudhari, Shreyas; Liu, Bo; Thomas, Philip S.; da Silva, Bruno Castro

In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment

External link: http://arxiv.org/abs/2312.12972

Book

Manual de Hepatología Clínica. [electronic resource]

Author: da Silva, Bruno Moreira

External link: KNAV e-book collection (Registered users: full text online 5 minutes, further access on request.)

Report

Behavior Alignment via Reward Function Optimization

Authors: Gupta, Dhawal; Chandak, Yash; Jordan, Scott M.; Thomas, Philip S.; da Silva, Bruno Castro

Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward specific behaviors is a complex task. This is challenging since it requires the identification of reward structures that are not sparse and that avoid inadve

External link: http://arxiv.org/abs/2310.19007

Report

Reducing the False Positive Rate Using Bayesian Inference in Autonomous Driving Perception

Authors: Melotti, Gledson; Bastos, Johann J. S.; da Silva, Bruno L. S.; Zanotelli, Tiago; Premebida, Cristiano

Object recognition is a crucial step in perception systems for autonomous and intelligent vehicles, as evidenced by the numerous research works in the topic. In this paper, object recognition is explored by using multisensory and multimodality approa

External link: http://arxiv.org/abs/2310.05951

Report

Coagent Networks: Generalized and Scaled

Authors: Kostas, James E.; Jordan, Scott M.; Chandak, Yash; Theocharous, Georgios; Gupta, Dhawal; White, Martha; da Silva, Bruno Castro; Thomas, Philip S.

Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpr

External link: http://arxiv.org/abs/2305.09838

Report

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Authors: Chandak, Yash; Shankar, Shiv; Bastian, Nathaniel D.; da Silva, Bruno Castro; Brunskill, Emma; Thomas, Philip S.

Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to exte

External link: http://arxiv.org/abs/2301.10330
