Search results

Report

Authors: Kaiser, Robin; Levine, Lionel; Sava-Huss, Ecaterina

Locally Markov walks are natural generalizations of classical Markov chains, where instead of a particle moving independently of the past, it decides where to move next depending on the last action performed at the current location. We introduce the

External link: http://arxiv.org/abs/2412.13766

Report

Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Authors: Zhou, Yifei; Yang, Qianlan; Lin, Kaixiang; Bai, Min; Zhou, Xiong; Wang, Yu-Xiong; Levine, Sergey; Li, Erran

The vision of a broadly capable and goal-directed agent, such as an Internet-browsing agent in the digital world and a household humanoid in the physical world, has rapidly advanced, thanks to the generalization capability of foundation models. Such

External link: http://arxiv.org/abs/2412.13194

Report

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Authors: Xu, Charles; Li, Qiyang; Luo, Jianlan; Levine, Sergey

Recent advances in robotic foundation models have enabled the development of generalist policies that can adapt to diverse tasks. While these models show impressive flexibility, their performance heavily depends on the quality of their training data.

External link: http://arxiv.org/abs/2412.09858

Report

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Authors: Zhou, Zhiyuan; Peng, Andy; Li, Qiyang; Levine, Sergey; Kumar, Aviral

The modern paradigm in machine learning involves pre-training on diverse data, followed by task-specific fine-tuning. In reinforcement learning (RL), this translates to learning via offline RL on a diverse historical dataset, followed by rapid online

External link: http://arxiv.org/abs/2412.07762

Report

How to quantify the coherence of a set of beliefs

Authors: Hess, Rowan; Levine, Lionel

Given conflicting probability estimates for a set of events, how can we quantify how much they conflict? How can we find a single probability distribution that best encapsulates the given estimates? One approach is to minimize a loss function such as

External link: http://arxiv.org/abs/2412.02777

Report

Artificial Expert Intelligence through PAC-reasoning

Authors: Shalev-Shwartz, Shai; Shashua, Amnon; Beniamini, Gal; Levine, Yoav; Sharir, Or; Wies, Noam; Ben-Shaul, Ido; Nussbaum, Tomer; Peled, Shir Granot

Artificial Expert Intelligence (AEI) seeks to transcend the limitations of both Artificial General Intelligence (AGI) and narrow AI by integrating domain-specific expertise with critical, precise reasoning capabilities akin to those of top human expe

External link: http://arxiv.org/abs/2412.02441

Report

An entropic puzzle in periodic dilaton gravity and DSSYK

Authors: Blommaert, Andreas; Levine, Adam; Mertens, Thomas G.; Papalini, Jacopo; Parmentier, Klaas

We study 2d dilaton gravity theories with a periodic potential, with special emphasis on sine dilaton gravity, which is holographically dual to double-scaled SYK. The periodicity of the potentials implies a symmetry under (discrete) shifts in the mom

External link: http://arxiv.org/abs/2411.16922

Report

Predicting Emergent Capabilities by Finetuning

Authors: Snell, Charlie; Wallace, Eric; Klein, Dan; Levine, Sergey

A fundamental open challenge in modern LLM scaling is the lack of understanding around emergent capabilities. In particular, language model pretraining loss is known to be highly predictable as a function of compute. However, downstream capabilities

External link: http://arxiv.org/abs/2411.16035

Report

OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data

Authors: Lu, Yiwen; Tong, Jiayi; Lei, Yuqing; Sutton, Alex J.; Chu, Haitao; Levine, Lisa D.; Lumley, Thomas; Asch, David A.; Duan, Rui; Schmid, Christopher H.; Chen, Yong

We introduce OrigamiPlot, an open-source R package and Shiny web application designed to enhance the visualization of multivariate data. This package implements the origami plot, a novel visualization technique proposed by Duan et al. in 2023, which

External link: http://arxiv.org/abs/2411.12674

Report

What Do Learning Dynamics Reveal About Generalization in LLM Reasoning?

Authors: Kang, Katie; Setlur, Amrith; Ghosh, Dibya; Steinhardt, Jacob; Tomlin, Claire; Levine, Sergey; Kumar, Aviral

Despite the remarkable capabilities of modern large language models (LLMs), the mechanisms behind their problem-solving abilities remain elusive. In this work, we aim to better understand how the learning dynamics of LLM finetuning shapes downstream

External link: http://arxiv.org/abs/2411.07681
