Showing 1 - 9 of 9 for search: '"Tajwar, Fahim"'
Author:
Tajwar, Fahim, Singh, Anikait, Sharma, Archit, Rafailov, Rafael, Schneider, Jeff, Xie, Tengyang, Ermon, Stefano, Finn, Chelsea, Kumar, Aviral
Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning…
External link:
http://arxiv.org/abs/2404.14367
Author:
Mark, Max Sobol, Sharma, Archit, Tajwar, Fahim, Rafailov, Rafael, Levine, Sergey, Finn, Chelsea
It is desirable for policies to optimistically explore new states and behaviors during online reinforcement learning (RL) or fine-tuning, especially when prior offline data does not provide enough state coverage. However, exploration bonuses can bias…
External link:
http://arxiv.org/abs/2310.08558
In safety-critical applications of machine learning, it is often desirable for a model to be conservative, abstaining from making predictions on unknown inputs which are not well-represented in the training data. However, detecting unknown examples is…
External link:
http://arxiv.org/abs/2306.04974
Author:
Lee, Yoonho, Chen, Annie S., Tajwar, Fahim, Kumar, Ananya, Yao, Huaxiu, Liang, Percy, Finn, Chelsea
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning…
External link:
http://arxiv.org/abs/2210.11466
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world. A critical challenge to such autonomy is the presence of irreversible states which require external assistance to recover from, such as…
External link:
http://arxiv.org/abs/2210.10765
Author:
Zhou, Allan, Tajwar, Fahim, Robey, Alexander, Knowles, Tom, Pappas, George J., Hassani, Hamed, Finn, Chelsea
To generalize well, classifiers must learn to be invariant to nuisance transformations that do not alter an input's class. Many problems have "class-agnostic" nuisance transformations that apply similarly to all classes, such as lighting and background…
External link:
http://arxiv.org/abs/2203.09739
Out-of-distribution detection is an important component of reliable ML systems. Prior literature has proposed various methods (e.g., MSP (Hendrycks & Gimpel, 2017), ODIN (Liang et al., 2018), Mahalanobis (Lee et al., 2018)), claiming they are state-of-the-art…
External link:
http://arxiv.org/abs/2109.05554
Author:
Lee, Jihyeon, Brooks, Nina R., Tajwar, Fahim, Burke, Marshall, Ermon, Stefano, Lobell, David B., Biswas, Debashish, Luby, Stephen P.
Published in:
Proceedings of the National Academy of Sciences of the United States of America, 2021 Apr, 118(17), 1-10.
External link:
https://www.jstor.org/stable/27040176
Author:
Lee, Jihyeon, Brooks, Nina R., Tajwar, Fahim, Burke, Marshall, Ermon, Stefano, Lobell, David B., Biswas, Debashish, Luby, Stephen P.
Published in:
Proceedings of the National Academy of Sciences of the United States of America; 4/27/2021, Vol. 118 Issue 17, p1-10, 10p