Showing 1 - 10 of 159 for search: '"Lee, Stefan"'
Beam search decoding is the de-facto method for decoding auto-regressive Neural Machine Translation (NMT) models, including multilingual NMT where the target language is specified as an input. However, decoding multilingual NMT models commonly produc
External link:
http://arxiv.org/abs/2408.05738
Visual control policies can encounter significant performance degradation when visual conditions like lighting or camera position differ from those seen during training -- often exhibiting sharp declines in capability even for minor differences. In t
External link:
http://arxiv.org/abs/2404.18926
Recent dataset deduplication techniques have demonstrated that content-aware dataset pruning can dramatically reduce the cost of training Vision-Language Pretrained (VLP) models without significant performance losses compared to training on the origi
External link:
http://arxiv.org/abs/2404.16123
Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance around ha
External link:
http://arxiv.org/abs/2309.06703
Published in:
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2574-2582. 2023
To be successful, Vision-and-Language Navigation (VLN) agents must be able to ground instructions to actions based on their surroundings. In this work, we develop a methodology to study agent behavior on a skill-specific basis -- examining how well e
External link:
http://arxiv.org/abs/2307.10790
Authors:
Krantz, Jacob, Gervet, Theophile, Yadav, Karmesh, Wang, Austin, Paxton, Chris, Mottaghi, Roozbeh, Batra, Dhruv, Malik, Jitendra, Lee, Stefan, Chaplot, Devendra Singh
Images are a convenient way to specify which particular object instance an embodied agent should navigate to. Solving this task requires semantic visual reasoning and exploration of unknown environments. We present a system that can perform this task
External link:
http://arxiv.org/abs/2304.01192
Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment. We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental
External link:
http://arxiv.org/abs/2301.13261
We consider the problem of embodied visual navigation given an image-goal (ImageNav) where an agent is initialized in an unfamiliar environment and tasked with navigating to a location 'described' by an image. Unlike related navigation tasks, ImageNa
External link:
http://arxiv.org/abs/2211.15876
Authors:
Deitke, Matt, Batra, Dhruv, Bisk, Yonatan, Campari, Tommaso, Chang, Angel X., Chaplot, Devendra Singh, Chen, Changan, D'Arpino, Claudia Pérez, Ehsani, Kiana, Farhadi, Ali, Fei-Fei, Li, Francis, Anthony, Gan, Chuang, Grauman, Kristen, Hall, David, Han, Winson, Jain, Unnat, Kembhavi, Aniruddha, Krantz, Jacob, Lee, Stefan, Li, Chengshu, Majumder, Sagnik, Maksymets, Oleksandr, Martín-Martín, Roberto, Mottaghi, Roozbeh, Raychaudhuri, Sonia, Roberts, Mike, Savarese, Silvio, Savva, Manolis, Shridhar, Mohit, Sünderhauf, Niko, Szot, Andrew, Talbot, Ben, Tenenbaum, Joshua B., Thomason, Jesse, Toshev, Alexander, Truong, Joanne, Weihs, Luca, Wu, Jiajun
We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) em
External link:
http://arxiv.org/abs/2210.06849
Authors:
Krantz, Jacob, Banerjee, Shurjo, Zhu, Wang, Corso, Jason, Anderson, Peter, Lee, Stefan, Thomason, Jesse
We present Iterative Vision-and-Language Navigation (IVLN), a paradigm for evaluating language-guided agents navigating in a persistent environment over time. Existing Vision-and-Language Navigation (VLN) benchmarks erase the agent's memory at the be
External link:
http://arxiv.org/abs/2210.03087