Search results

Report

African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification

Authors: Geigle, Gregor; Timofte, Radu; Glavaš, Goran

Recent Large Vision-Language Models (LVLMs) demonstrate impressive abilities on numerous image understanding and reasoning tasks. The task of fine-grained object classification (e.g., distinction between animal species), however, has been pr

External link: http://arxiv.org/abs/2406.14496

Report

Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?

Authors: Geigle, Gregor; Timofte, Radu; Glavaš, Goran

Large vision-language models (LVLMs) have recently dramatically pushed the state of the art in image captioning and many image understanding tasks (e.g., visual question answering). LVLMs, however, often hallucinate and produce captions that

External link: http://arxiv.org/abs/2406.14492

Report

InstructIR: High-Quality Image Restoration Following Human Instructions

Authors: Conde, Marcos V.; Geigle, Gregor; Timofte, Radu

Image restoration is a fundamental problem that involves recovering a high-quality clean image from its degraded observation. All-In-One image restoration models can effectively restore images from various types and levels of degradation using degrad

External link: http://arxiv.org/abs/2401.16468

Report

mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs

Authors: Geigle, Gregor; Jain, Abhay; Timofte, Radu; Glavaš, Goran

Modular vision-language models (Vision-LLMs) align pretrained image encoders with (frozen) large language models (LLMs) and post-hoc condition LLMs to 'understand' the image input. With the abundance of readily available high-quality English image-te

External link: http://arxiv.org/abs/2307.06930

Report

Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations

Authors: Geigle, Gregor; Timofte, Radu; Glavaš, Goran

Vision-and-language (VL) models with separate encoders for each modality (e.g., CLIP) have become the go-to models for zero-shot image classification and image-text retrieval. They are, however, mostly evaluated in English as multilingual benchmarks

External link: http://arxiv.org/abs/2306.08658

Report

One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks

Authors: Geigle, Gregor; Liu, Chen Cecilia; Pfeiffer, Jonas; Gurevych, Iryna

Current multimodal models, aimed at solving Vision and Language (V+L) tasks, predominantly repurpose Vision Encoders (VE) as feature extractors. While many VEs -- of different architectures, trained on different data and objectives -- are publicly av

External link: http://arxiv.org/abs/2210.06379

Report

UKP-SQUARE: An Online Platform for Question Answering Research

Authors: Baumgärtner, Tim; Wang, Kexin; Sachdeva, Rachneet; Eichler, Max; Geigle, Gregor; Poth, Clifton; Sterz, Hannah; Puerto, Haritz; Ribeiro, Leonardo F. R.; Pfeiffer, Jonas; Reimers, Nils; Şahin, Gözde Gül; Gurevych, Iryna

Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and

External link: http://arxiv.org/abs/2203.13693

Academic article

Systemic Inflammatory Changes in Spinal Cord Injured Patients after Adding Aquatic Therapy to Standard Physiotherapy Treatment

Authors: María Teresa Agulló-Ortuño, Helena Romay-Barrero, Johan Lambeck, Juan M. Blanco-Calonge, Rubén Arroyo-Fernández, Paula Richley Geigle, Raquel Menchero, Gonzalo Melgar del Corral, Inés Martínez-Galán

Published in: International Journal of Molecular Sciences, Vol 25, Iss 14, p 7961 (2024)

Spinal cord injury (SCI) is a severe medical condition resulting in substantial physiological and functional consequences for the individual. People with SCI are characterised by a chronic, low-grade systemic inflammatory state, which contributes to

External link: https://doaj.org/article/e52c7f30fb744d2082d27f68eacf382f

Report

xGQA: Cross-Lingual Visual Question Answering

Authors: Pfeiffer, Jonas; Geigle, Gregor; Kamath, Aishwarya; Steitz, Jan-Martin O.; Roth, Stefan; Vulić, Ivan; Gurevych, Iryna

Recent advances in multimodal vision and language modeling have predominantly focused on the English language, mostly due to the lack of multilingual multimodal datasets to steer modeling efforts. In this work, we address this gap and provide xGQA, a

External link: http://arxiv.org/abs/2109.06082

Report

TWEAC: Transformer with Extendable QA Agent Classifiers

Authors: Geigle, Gregor; Reimers, Nils; Rücklé, Andreas; Gurevych, Iryna

Question answering systems should help users to access knowledge on a broad range of topics and to answer a wide array of different questions. Most systems fall short of this expectation as they are only specialized in one particular setting, e.g., a

External link: http://arxiv.org/abs/2104.07081
