Search results

Report

Language-Informed Beam Search Decoding for Multilingual Machine Translation

Authors: Yang, Yilin; Lee, Stefan; Tadepalli, Prasad

Beam search decoding is the de-facto method for decoding auto-regressive Neural Machine Translation (NMT) models, including multilingual NMT where the target language is specified as an input. However, decoding multilingual NMT models commonly produc

External link: http://arxiv.org/abs/2408.05738

Report

Point Cloud Models Improve Visual Robustness in Robotic Learners

Authors: Peri, Skand; Lee, Iain; Kim, Chanho; Fuxin, Li; Hermans, Tucker; Lee, Stefan

Visual control policies can encounter significant performance degradation when visual conditions like lighting or camera position differ from those seen during training -- often exhibiting sharp declines in capability even for minor differences. In t

External link: http://arxiv.org/abs/2404.18926

Report

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Authors: Slyman, Eric; Lee, Stefan; Cohen, Scott; Kafle, Kushal

Recent dataset deduplication techniques have demonstrated that content-aware dataset pruning can dramatically reduce the cost of training Vision-Language Pretrained (VLP) models without significant performance losses compared to training on the origi

External link: http://arxiv.org/abs/2404.16123

Report

VLSlice: Interactive Vision-and-Language Slice Discovery

Authors: Slyman, Eric; Kahng, Minsuk; Lee, Stefan

Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance around ha

External link: http://arxiv.org/abs/2309.06703

Report

Behavioral Analysis of Vision-and-Language Navigation Agents

Authors: Yang, Zijiao; Majumdar, Arjun; Lee, Stefan

Published in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2574-2582, 2023

To be successful, Vision-and-Language Navigation (VLN) agents must be able to ground instructions to actions based on their surroundings. In this work, we develop a methodology to study agent behavior on a skill-specific basis -- examining how well e

External link: http://arxiv.org/abs/2307.10790

Report

Navigating to Objects Specified by Images

Authors: Krantz, Jacob; Gervet, Theophile; Yadav, Karmesh; Wang, Austin; Paxton, Chris; Mottaghi, Roozbeh; Batra, Dhruv; Malik, Jitendra; Lee, Stefan; Chaplot, Devendra Singh

Images are a convenient way to specify which particular object instance an embodied agent should navigate to. Solving this task requires semantic visual reasoning and exploration of unknown environments. We present a system that can perform this task

External link: http://arxiv.org/abs/2304.01192

Report

Emergence of Maps in the Memories of Blind Navigation Agents

Authors: Wijmans, Erik; Savva, Manolis; Essa, Irfan; Lee, Stefan; Morcos, Ari S.; Batra, Dhruv

Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental

External link: http://arxiv.org/abs/2301.13261

Report

Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances

Authors: Krantz, Jacob; Lee, Stefan; Malik, Jitendra; Batra, Dhruv; Chaplot, Devendra Singh

We consider the problem of embodied visual navigation given an image-goal (ImageNav) where an agent is initialized in an unfamiliar environment and tasked with navigating to a location 'described' by an image. Unlike related navigation tasks, ImageNa

External link: http://arxiv.org/abs/2211.15876

Report

Retrospectives on the Embodied AI Workshop

We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) em

External link: http://arxiv.org/abs/2210.06849

Report

Iterative Vision-and-Language Navigation

Authors: Krantz, Jacob; Banerjee, Shurjo; Zhu, Wang; Corso, Jason; Anderson, Peter; Lee, Stefan; Thomason, Jesse

We present Iterative Vision-and-Language Navigation (IVLN), a paradigm for evaluating language-guided agents navigating in a persistent environment over time. Existing Vision-and-Language Navigation (VLN) benchmarks erase the agent's memory at the be

External link: http://arxiv.org/abs/2210.03087
