Showing 1 - 10 of 1,733 for search: '"Silva, Bruno P."'
Evaluating policies using off-policy data is crucial for applying reinforcement learning to real-world problems such as healthcare and autonomous driving. Previous methods for off-policy evaluation (OPE) generally suffer from high variance or irreduc
External link:
http://arxiv.org/abs/2410.02172
Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous cal
External link:
http://arxiv.org/abs/2406.16241
The spread of microbial infections is governed by the self-organization of bacteria on surfaces. Limitations of live imaging techniques make collective behaviors in clinically relevant systems challenging to quantify. Here, novel experimental and ima
External link:
http://arxiv.org/abs/2405.12407
Authors:
Chaudhari, Shreyas, Aggarwal, Pranjal, Murahari, Vishvak, Rajpurohit, Tanmay, Kalyan, Ashwin, Narasimhan, Karthik, Deshpande, Ameet, da Silva, Bruno Castro
State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from hu
External link:
http://arxiv.org/abs/2404.08555
Authors:
Mecklenburg, Nick, Lin, Yiyou, Li, Xiaoxiao, Holstein, Daniel, Nunes, Leonardo, Malvar, Sara, Silva, Bruno, Chandra, Ranveer, Aski, Vijay, Yannam, Pavan Kumar Reddy, Aktas, Tolga, Hendry, Todd
In recent years, Large Language Models (LLMs) have shown remarkable performance in generating human-like text, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain knowledge rema
External link:
http://arxiv.org/abs/2404.00213
Authors:
Fernández-Rodríguez, Marcos, Silva, Bruno, Queirós, Sandro, Torres, Helena R., Oliveira, Bruno, Morais, Pedro, Buschle, Lukas R., Correia-Pinto, Jorge, Lima, Estevão, Vilaça, João L.
Published in:
Proceedings Volume 12928, Medical Imaging 2024: Image-Guided Procedures, Robotic Interventions, and Modeling; 1292827 (2024)
Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite the progress of deep learning in recent years, the dynamic setting of laparoscopic surgery still presents challenges for precise segmentation. T
External link:
http://arxiv.org/abs/2403.10216
Authors:
Balaguer, Angels, Benara, Vinamra, Cunha, Renato Luiz de Freitas, Filho, Roberto de M. Estevão, Hendry, Todd, Holstein, Daniel, Marsman, Jennifer, Mecklenburg, Nick, Malvar, Sara, Nunes, Leonardo O., Padilha, Rafael, Sharp, Morris, Silva, Bruno, Sharma, Swati, Aski, Vijay, Chandra, Ranveer
There are two common ways in which developers are incorporating proprietary and domain-specific data when building applications of Large Language Models (LLMs): Retrieval-Augmented Generation (RAG) and Fine-Tuning. RAG augments the prompt with the ex
External link:
http://arxiv.org/abs/2401.08406
Authors:
Gupta, Dhawal, Jordan, Scott M., Chaudhari, Shreyas, Liu, Bo, Thomas, Philip S., da Silva, Bruno Castro
In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment
External link:
http://arxiv.org/abs/2312.12972
Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward specific behaviors is a complex task. This is challenging since it requires the identification of reward structures that are not sparse and that avoid inadve
External link:
http://arxiv.org/abs/2310.19007
Large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding across various domains, including healthcare and finance. For some tasks, LLMs achieve similar or better performance than trained human beings, t
External link:
http://arxiv.org/abs/2310.06225