Showing 1 - 6 of 6 for search: '"Krojer, Benno"'
Author:
Krojer, Benno, Vattikonda, Dheeraj, Lara, Luis, Jampani, Varun, Portelance, Eva, Pal, Christopher, Reddy, Siva
An image editing model should be able to perform diverse edits, ranging from object replacement, changing attributes or style, to performing actions or movement, which require many forms of reasoning. Current general instruction-guided editing models
External link:
http://arxiv.org/abs/2407.03471
Eight years after the visual question answering (VQA) task was proposed, accuracy remains the primary metric for automatic evaluation. VQA Accuracy has been effective so far in the IID evaluation setting. However, our community is undergoing a shift towa
External link:
http://arxiv.org/abs/2310.02567
We propose a simple yet effective and robust method for contrastive captioning: generating discriminative captions that distinguish target images from very similar alternative distractor images. Our approach is built on a pragmatic inference procedur
External link:
http://arxiv.org/abs/2306.08818
Text-conditioned image generation models have recently shown immense qualitative success using denoising diffusion processes. However, unlike discriminative vision-and-language models, it is a non-trivial task to subject these diffusion-based generat
External link:
http://arxiv.org/abs/2305.16397
The ability to integrate context, including perceptual and temporal cues, plays a pivotal role in grounding the meaning of a linguistic utterance. In order to measure to what extent current vision-and-language models master this ability, we devise a
External link:
http://arxiv.org/abs/2203.15867
How can pretrained language models (PLMs) learn factual knowledge from the training set? We investigate the two most important mechanisms: reasoning and memorization. Prior work has attempted to quantify the number of facts PLMs learn, but we present
External link:
http://arxiv.org/abs/2006.10413