Zobrazeno 1 - 10
of 151
pro vyhledávání: '"Bastani, Osbert"'
Autor:
Guo, Wentao, Long, Jikai, Zeng, Yimeng, Liu, Zirui, Yang, Xinyu, Ran, Yide, Gardner, Jacob R., Bastani, Osbert, De Sa, Christopher, Yu, Xiaodong, Chen, Beidi, Xu, Zhaozhuo
Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes. However, the application of ZO fine-tuning in memory-constrained settings such as mobile phones and laptops is still challe
Externí odkaz:
http://arxiv.org/abs/2406.02913
Autor:
Ma, Yecheng Jason, Liang, William, Wang, Hung-Ju, Wang, Sam, Zhu, Yuke, Fan, Linxi, Bastani, Osbert, Jayaraman, Dinesh
Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale. However, sim-to-real approaches typically rely on manual design and tuning of the task reward function as well as the simulatio
Externí odkaz:
http://arxiv.org/abs/2406.01967
The growing safety concerns surrounding Large Language Models (LLMs) raise an urgent need to align them with diverse human preferences to simultaneously enhance their helpfulness and safety. A promising approach is to enforce safety constraints throu
Externí odkaz:
http://arxiv.org/abs/2405.19544
Machine learning has become an effective tool for automatically annotating unstructured data (e.g., images) with structured labels (e.g., object detections). As a result, a new programming paradigm called neurosymbolic programming has emerged where u
Externí odkaz:
http://arxiv.org/abs/2405.15912
Conformal prediction has emerged as an effective strategy for uncertainty quantification by modifying a model to output sets of labels instead of a single label. These prediction sets come with the guarantee that they contain the true label with high
Externí odkaz:
http://arxiv.org/abs/2405.13268
Large language models (LLMs) have shown impressive results at a wide-range of tasks. However, they have limitations, such as hallucinating facts and struggling with arithmetic. Recent work has addressed these issues with sophisticated decoding techni
Externí odkaz:
http://arxiv.org/abs/2405.11361
We study the stochastic bandit problem with ReLU neural network structure. We show that a $\tilde{O}(\sqrt{T})$ regret guarantee is achievable by considering bandits with one-layer ReLU neural networks; to the best of our knowledge, our work is the f
Externí odkaz:
http://arxiv.org/abs/2405.07331
Autor:
Huang, Xinmeng, Li, Shuo, Yu, Mengxin, Sesia, Matteo, Hassani, Hamed, Lee, Insup, Bastani, Osbert, Dobriban, Edgar
Language Models (LMs) have shown promising performance in natural language generation. However, as LMs often generate incorrect or hallucinated responses, it is crucial to correctly quantify their uncertainty in responding to given inputs. In additio
Externí odkaz:
http://arxiv.org/abs/2404.03163
Autor:
Khazatsky, Alexander, Pertsch, Karl, Nair, Suraj, Balakrishna, Ashwin, Dasari, Sudeep, Karamcheti, Siddharth, Nasiriany, Soroush, Srirama, Mohan Kumar, Chen, Lawrence Yunliang, Ellis, Kirsty, Fagan, Peter David, Hejna, Joey, Itkina, Masha, Lepert, Marion, Ma, Yecheng Jason, Miller, Patrick Tree, Wu, Jimmy, Belkhale, Suneel, Dass, Shivin, Ha, Huy, Jain, Arhan, Lee, Abraham, Lee, Youngwoon, Memmel, Marius, Park, Sungjae, Radosavovic, Ilija, Wang, Kaiyuan, Zhan, Albert, Black, Kevin, Chi, Cheng, Hatch, Kyle Beltran, Lin, Shan, Lu, Jingpei, Mercat, Jean, Rehman, Abdul, Sanketi, Pannag R, Sharma, Archit, Simpson, Cody, Vuong, Quan, Walke, Homer Rich, Wulfe, Blake, Xiao, Ted, Yang, Jonathan Heewon, Yavary, Arefeh, Zhao, Tony Z., Agia, Christopher, Baijal, Rohan, Castro, Mateo Guaman, Chen, Daphne, Chen, Qiuyu, Chung, Trinity, Drake, Jaimyn, Foster, Ethan Paul, Gao, Jensen, Herrera, David Antonio, Heo, Minho, Hsu, Kyle, Hu, Jiaheng, Jackson, Donovon, Le, Charlotte, Li, Yunshuang, Lin, Kevin, Lin, Roy, Ma, Zehan, Maddukuri, Abhiram, Mirchandani, Suvir, Morton, Daniel, Nguyen, Tony, O'Neill, Abigail, Scalise, Rosario, Seale, Derick, Son, Victor, Tian, Stephen, Tran, Emi, Wang, Andrew E., Wu, Yilin, Xie, Annie, Yang, Jingyun, Yin, Patrick, Zhang, Yunchu, Bastani, Osbert, Berseth, Glen, Bohg, Jeannette, Goldberg, Ken, Gupta, Abhinav, Gupta, Abhishek, Jayaraman, Dinesh, Lim, Joseph J, Malik, Jitendra, Martín-Martín, Roberto, Ramamoorthy, Subramanian, Sadigh, Dorsa, Song, Shuran, Wu, Jiajun, Yip, Michael C., Zhu, Yuke, Kollar, Thomas, Levine, Sergey, Finn, Chelsea
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipul
Externí odkaz:
http://arxiv.org/abs/2403.12945
Autor:
Yao, Michael S., Zeng, Yimeng, Bastani, Hamsa, Gardner, Jacob, Gee, James C., Bastani, Osbert
Offline model-based policy optimization seeks to optimize a learned surrogate objective function without querying the true oracle objective during optimization. However, inaccurate surrogate model predictions are frequently encountered along the opti
Externí odkaz:
http://arxiv.org/abs/2402.06532