Showing 1 - 10 of 1,932 for search: '"Anderson, Peter P."'
Financial documents are filled with specialized terminology, arcane jargon, and curious acronyms that pose challenges for general-purpose text embeddings. Yet, few text embeddings specialized for finance have been reported in the literature, perhaps
External link:
http://arxiv.org/abs/2411.07142
Text-to-image generation models are powerful but difficult to use. Users craft specific prompts to get better images, though the images can be repetitive. This paper proposes a Prompt Expansion framework that helps users generate high-quality, divers
External link:
http://arxiv.org/abs/2312.16720
Authors:
Cho, Jaemin, Hu, Yushi, Garg, Roopal, Anderson, Peter, Krishna, Ranjay, Baldridge, Jason, Bansal, Mohit, Pont-Tuset, Jordi, Wang, Su
Evaluating text-to-image models is notoriously difficult. A strong recent approach for assessing text-image faithfulness is based on QG/A (question generation and answering), which uses pre-trained foundational models to automatically generate a set
External link:
http://arxiv.org/abs/2310.18235
Authors:
Wang, Su, Saharia, Chitwan, Montgomery, Ceslee, Pont-Tuset, Jordi, Noy, Shai, Pellegrini, Stefano, Onoe, Yasumasa, Laszlo, Sarah, Fleet, David J., Soricut, Radu, Baldridge, Jason, Norouzi, Mohammad, Anderson, Peter, Chan, William
Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a cascaded dif
External link:
http://arxiv.org/abs/2212.06909
Authors:
Kamath, Aishwarya, Anderson, Peter, Wang, Su, Koh, Jing Yu, Ku, Alexander, Waters, Austin, Yang, Yinfei, Baldridge, Jason, Parekh, Zarana
Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments, as a step towards robots that can follow human instructions. However, given the scarcity of hum
External link:
http://arxiv.org/abs/2210.03112
Authors:
Krantz, Jacob, Banerjee, Shurjo, Zhu, Wang, Corso, Jason, Anderson, Peter, Lee, Stefan, Thomason, Jesse
We present Iterative Vision-and-Language Navigation (IVLN), a paradigm for evaluating language-guided agents navigating in a persistent environment over time. Existing Vision-and-Language Navigation (VLN) benchmarks erase the agent's memory at the be
External link:
http://arxiv.org/abs/2210.03087
Authors:
Koh, Jing Yu, Agrawal, Harsh, Batra, Dhruv, Tucker, Richard, Waters, Austin, Lee, Honglak, Yang, Yinfei, Baldridge, Jason, Anderson, Peter
We study the problem of synthesizing immersive 3D indoor scenes from one or more images. Our aim is to generate high-resolution images and videos from novel viewpoints, including viewpoints that extrapolate far beyond the input images while maintaini
External link:
http://arxiv.org/abs/2204.02960
Authors:
Wang, Su, Montgomery, Ceslee, Orbay, Jordi, Birodkar, Vighnesh, Faust, Aleksandra, Gur, Izzeddin, Jaques, Natasha, Waters, Austin, Baldridge, Jason, Anderson, Peter
We study the automatic generation of navigation instructions from 360-degree images captured on indoor routes. Existing generators suffer from poor visual grounding, causing them to rely on language priors and hallucinate objects. Our MARKY-MT5 syste
External link:
http://arxiv.org/abs/2111.12872
People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer, a visual w
External link:
http://arxiv.org/abs/2105.08756
PanGEA, the Panoramic Graph Environment Annotation toolkit, is a lightweight toolkit for collecting speech and text annotations in photo-realistic 3D environments. PanGEA immerses annotators in a web-based simulation and allows them to move around ea
External link:
http://arxiv.org/abs/2103.12703