Search results - "Bastani, Osbert"

Report

Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity

Authors: Guo, Wentao, Long, Jikai, Zeng, Yimeng, Liu, Zirui, Yang, Xinyu, Ran, Yide, Gardner, Jacob R., Bastani, Osbert, De Sa, Christopher, Yu, Xiaodong, Chen, Beidi, Xu, Zhaozhuo

Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes. However, the application of ZO fine-tuning in memory-constrained settings such as mobile phones and laptops is still challe

External link: http://arxiv.org/abs/2406.02913

Report

DrEureka: Language Model Guided Sim-To-Real Transfer

Authors: Ma, Yecheng Jason, Liang, William, Wang, Hung-Ju, Wang, Sam, Zhu, Yuke, Fan, Linxi, Bastani, Osbert, Jayaraman, Dinesh

Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale. However, sim-to-real approaches typically rely on manual design and tuning of the task reward function as well as the simulatio

External link: http://arxiv.org/abs/2406.01967

Report

One-Shot Safety Alignment for Large Language Models via Optimal Dualization

Authors: Huang, Xinmeng, Li, Shuo, Dobriban, Edgar, Bastani, Osbert, Hassani, Hamed, Ding, Dongsheng

The growing safety concerns surrounding Large Language Models (LLMs) raise an urgent need to align them with diverse human preferences to simultaneously enhance their helpfulness and safety. A promising approach is to enforce safety constraints throu

External link: http://arxiv.org/abs/2405.19544

Report

Uncertainty Quantification for Neurosymbolic Programs via Compositional Conformal Prediction

Authors: Ramalingam, Ramya, Park, Sangdon, Bastani, Osbert

Machine learning has become an effective tool for automatically annotating unstructured data (e.g., images) with structured labels (e.g., object detections). As a result, a new programming paradigm called neurosymbolic programming has emerged where u

External link: http://arxiv.org/abs/2405.15912

Report

Stochastic Online Conformal Prediction with Semi-Bandit Feedback

Authors: Ge, Haosen, Bastani, Hamsa, Bastani, Osbert

Conformal prediction has emerged as an effective strategy for uncertainty quantification by modifying a model to output sets of labels instead of a single label. These prediction sets come with the guarantee that they contain the true label with high

External link: http://arxiv.org/abs/2405.13268

Report

An Opportunistically Parallel Lambda Calculus for Performant Composition of Large Language Models

Authors: Mell, Stephen, Zdancewic, Steve, Bastani, Osbert

Large language models (LLMs) have shown impressive results at a wide range of tasks. However, they have limitations, such as hallucinating facts and struggling with arithmetic. Recent work has addressed these issues with sophisticated decoding techni

External link: http://arxiv.org/abs/2405.11361

Report

Stochastic Bandits with ReLU Neural Networks

Authors: Xu, Kan, Bastani, Hamsa, Goel, Surbhi, Bastani, Osbert

We study the stochastic bandit problem with ReLU neural network structure. We show that a $\tilde{O}(\sqrt{T})$ regret guarantee is achievable by considering bandits with one-layer ReLU neural networks; to the best of our knowledge, our work is the f

External link: http://arxiv.org/abs/2405.07331

Report

Uncertainty in Language Models: Assessment through Rank-Calibration

Authors: Huang, Xinmeng, Li, Shuo, Yu, Mengxin, Sesia, Matteo, Hassani, Hamed, Lee, Insup, Bastani, Osbert, Dobriban, Edgar

Language Models (LMs) have shown promising performance in natural language generation. However, as LMs often generate incorrect or hallucinated responses, it is crucial to correctly quantify their uncertainty in responding to given inputs. In additio

External link: http://arxiv.org/abs/2404.03163

Report

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Authors: Khazatsky, Alexander, Pertsch, Karl, Nair, Suraj, Balakrishna, Ashwin, Dasari, Sudeep, Karamcheti, Siddharth, Nasiriany, Soroush, Srirama, Mohan Kumar, Chen, Lawrence Yunliang, Ellis, Kirsty, Fagan, Peter David, Hejna, Joey, Itkina, Masha, Lepert, Marion, Ma, Yecheng Jason, Miller, Patrick Tree, Wu, Jimmy, Belkhale, Suneel, Dass, Shivin, Ha, Huy, Jain, Arhan, Lee, Abraham, Lee, Youngwoon, Memmel, Marius, Park, Sungjae, Radosavovic, Ilija, Wang, Kaiyuan, Zhan, Albert, Black, Kevin, Chi, Cheng, Hatch, Kyle Beltran, Lin, Shan, Lu, Jingpei, Mercat, Jean, Rehman, Abdul, Sanketi, Pannag R, Sharma, Archit, Simpson, Cody, Vuong, Quan, Walke, Homer Rich, Wulfe, Blake, Xiao, Ted, Yang, Jonathan Heewon, Yavary, Arefeh, Zhao, Tony Z., Agia, Christopher, Baijal, Rohan, Castro, Mateo Guaman, Chen, Daphne, Chen, Qiuyu, Chung, Trinity, Drake, Jaimyn, Foster, Ethan Paul, Gao, Jensen, Herrera, David Antonio, Heo, Minho, Hsu, Kyle, Hu, Jiaheng, Jackson, Donovon, Le, Charlotte, Li, Yunshuang, Lin, Kevin, Lin, Roy, Ma, Zehan, Maddukuri, Abhiram, Mirchandani, Suvir, Morton, Daniel, Nguyen, Tony, O'Neill, Abigail, Scalise, Rosario, Seale, Derick, Son, Victor, Tian, Stephen, Tran, Emi, Wang, Andrew E., Wu, Yilin, Xie, Annie, Yang, Jingyun, Yin, Patrick, Zhang, Yunchu, Bastani, Osbert, Berseth, Glen, Bohg, Jeannette, Goldberg, Ken, Gupta, Abhinav, Gupta, Abhishek, Jayaraman, Dinesh, Lim, Joseph J, Malik, Jitendra, Martín-Martín, Roberto, Ramamoorthy, Subramanian, Sadigh, Dorsa, Song, Shuran, Wu, Jiajun, Yip, Michael C., Zhu, Yuke, Kollar, Thomas, Levine, Sergey, Finn, Chelsea

The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipul

External link: http://arxiv.org/abs/2403.12945

Report

Generative Adversarial Bayesian Optimization for Surrogate Objectives

Authors: Yao, Michael S., Zeng, Yimeng, Bastani, Hamsa, Gardner, Jacob, Gee, James C., Bastani, Osbert

Offline model-based policy optimization seeks to optimize a learned surrogate objective function without querying the true oracle objective during optimization. However, inaccurate surrogate model predictions are frequently encountered along the opti

External link: http://arxiv.org/abs/2402.06532
