Search results - "Kembhavi, Aniruddha"

Report

PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

Authors: Zeng, Kuo-Hao, Zhang, Zichen, Ehsani, Kiana, Hendrix, Rose, Salvador, Jordi, Herrasti, Alvaro, Girshick, Ross, Kembhavi, Aniruddha, Weihs, Luca

We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real-world without adaptation despite being trained purely in simulation. PoliFormer uses

External link: http://arxiv.org/abs/2406.20083

Report

CodeNav: Beyond tool-use to using real-world codebases with LLM agents

Authors: Gupta, Tanmay, Weihs, Luca, Kembhavi, Aniruddha

We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents that require "registration" of all relevant tools via manual descriptions within the LLM con

External link: http://arxiv.org/abs/2406.12276

Report

Task Me Anything

Authors: Zhang, Jieyu, Huang, Weikai, Ma, Zixian, Michel, Oscar, He, Dong, Gupta, Tanmay, Ma, Wei-Chiu, Farhadi, Ali, Kembhavi, Aniruddha, Krishna, Ranjay

Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating for a specific capability. As a result, when a developer wants to identify which models to use for thei

External link: http://arxiv.org/abs/2406.11775

Report

Preserving Identity with Variational Score for General-purpose 3D Editing

Authors: Le, Duong H., Pham, Tuan, Kembhavi, Aniruddha, Mandt, Stephan, Ma, Wei-Chiu, Lu, Jiasen

We present Piva (Preserving Identity with Variational Score Distillation), a novel optimization-based method for editing images and 3D models based on diffusion models. Specifically, our approach is inspired by the recently proposed method for 2D ima

External link: http://arxiv.org/abs/2406.08953

Report

Iterated Learning Improves Compositionality in Large Vision-Language Models

Authors: Zheng, Chenhao, Zhang, Jieyu, Kembhavi, Aniruddha, Krishna, Ranjay

A fundamental characteristic common to both human vision and natural language is their compositional nature. Yet, despite the performance gains contributed by large vision and language pretraining, recent investigations find that most, if not all, our

External link: http://arxiv.org/abs/2404.02145

Report

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Authors: Lu, Jiasen, Clark, Christopher, Lee, Sangho, Zhang, Zichen, Khosla, Savya, Marten, Ryan, Hoiem, Derek, Kembhavi, Aniruddha

We present Unified-IO 2, the first autoregressive multimodal model that is capable of understanding and generating image, text, audio, and action. To unify different modalities, we tokenize inputs and outputs -- images, text, audio, action, bounding

External link: http://arxiv.org/abs/2312.17172

Report

Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences

Authors: Hwang, Minyoung, Weihs, Luca, Park, Chanwoo, Lee, Kimin, Kembhavi, Aniruddha, Ehsani, Kiana

Customizing robotic behaviors to be aligned with diverse human preferences is an underexplored challenge in the field of embodied AI. In this paper, we present Promptable Behaviors, a novel framework that facilitates efficient personalization of robo

External link: http://arxiv.org/abs/2312.09337

Report

Holodeck: Language Guided Generation of 3D Embodied AI Environments

Authors: Yang, Yue, Sun, Fan-Yun, Weihs, Luca, VanderBilt, Eli, Herrasti, Alvaro, Han, Winson, Wu, Jiajun, Haber, Nick, Krishna, Ranjay, Liu, Lingjie, Callison-Burch, Chris, Yatskar, Mark, Kembhavi, Aniruddha, Clark, Christopher

3D simulated environments play a critical role in Embodied AI, but their creation requires expertise and extensive manual effort, restricting their diversity and scope. To mitigate this limitation, we present Holodeck, a system that generates 3D envi

External link: http://arxiv.org/abs/2312.09067

Report

Harmonic Mobile Manipulation

Authors: Yang, Ruihan, Kim, Yejin, Kembhavi, Aniruddha, Wang, Xiaolong, Ehsani, Kiana

Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently. However, robots are still impotent in many household tasks requiring coordinated behaviors such as opening doors. The factoriz

External link: http://arxiv.org/abs/2312.06639
