Showing 1 - 10 of 1,651 results for the search: '"Konstas, A"'
Evaluating Large Language Models (LLMs) on reasoning benchmarks demonstrates their ability to solve compositional questions. However, little is known about whether these models engage in genuine logical reasoning or simply rely on implicit cues to generate …
External link: http://arxiv.org/abs/2410.20200
Author: Nikandrou, Malvina; Pantazopoulos, Georgios; Vitsakis, Nikolas; Konstas, Ioannis; Suglia, Alessandro
As Vision and Language models (VLMs) become accessible across the globe, it is important that they demonstrate cultural knowledge. In this paper, we introduce CROPE, a visual question answering benchmark designed to probe the knowledge of culture-specific …
External link: http://arxiv.org/abs/2410.15453
Gender-Based Violence (GBV) is an increasing problem online, but existing datasets fail to capture the plurality of possible annotator perspectives or to ensure the representation of affected groups. We revisit two important stages in the moderation pipeline …
External link: http://arxiv.org/abs/2410.03543
Language models have been shown to reproduce the biases present in their training data, which by default reflects the majority perspective. Proposed solutions aim to capture minority perspectives by either modelling annotator disagreements or group… (an illustrative soft-label training sketch follows this entry).
External link: http://arxiv.org/abs/2407.14259
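The snippet above mentions modelling annotator disagreements rather than collapsing labels to a majority vote. The following is only a minimal sketch of that general idea, not the method of the paper linked above: a classifier is trained against "soft" labels built from per-annotator votes. The toy data, dimensions, and helper names are hypothetical.

```python
# Minimal sketch (assumed setup, not the paper's method): training against
# soft labels that preserve annotator disagreement instead of a majority vote.
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_labels(annotations, num_classes):
    """Turn a list of per-annotator votes into a probability distribution."""
    counts = torch.bincount(torch.tensor(annotations), minlength=num_classes).float()
    return counts / counts.sum()

# Toy batch: 3 items, each labelled by several (hypothetical) annotators.
votes = [[0, 0, 1], [1, 1, 1], [0, 1, 1, 0]]
targets = torch.stack([soft_labels(v, num_classes=2) for v in votes])

# Hypothetical features standing in for encoded text.
features = torch.randn(3, 16)
model = nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):
    optimizer.zero_grad()
    log_probs = F.log_softmax(model(features), dim=-1)
    # Cross-entropy against the full annotator distribution (soft targets),
    # so the model is rewarded for matching disagreement, not just the majority.
    loss = -(targets * log_probs).sum(dim=-1).mean()
    loss.backward()
    optimizer.step()

print("final soft-label loss:", loss.item())
```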
Evaluating the generalisation capabilities of multimodal models based solely on their performance on out-of-distribution data fails to capture their true robustness. This work introduces a comprehensive evaluation framework that systematically examines …
External link: http://arxiv.org/abs/2407.03967
Continual learning focuses on incrementally training a model on a sequence of tasks, with the aim of learning new tasks while minimizing the performance drop on previous tasks. Existing approaches at the intersection of Continual Learning and Visual Question Answering … (an illustrative sketch of the sequential-training setup follows this entry).
External link: http://arxiv.org/abs/2406.19297
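The entry above describes the continual-learning setting: train on tasks one after another and track how much earlier tasks degrade. As a rough illustration of that setup only (not the paper's approach, and with toy classification tasks standing in for Visual Question Answering), here is a minimal sketch of naive sequential fine-tuning with per-task re-evaluation.

```python
# Minimal sketch (assumed setup): sequential fine-tuning on a task stream,
# re-evaluating earlier tasks after each stage to expose forgetting.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(num_classes=4, dim=32, n=256):
    """A toy classification task with its own random labelling rule."""
    x = torch.randn(n, dim)
    w = torch.randn(dim, num_classes)
    y = (x @ w).argmax(dim=-1)
    return x, y

tasks = [make_task() for _ in range(3)]
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=-1) == y).float().mean().item()

for t, (x, y) in enumerate(tasks):
    for _ in range(200):  # fine-tune on the current task only
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    # After each task, check how well earlier tasks are still solved.
    scores = [round(accuracy(px, py), 2) for px, py in tasks[: t + 1]]
    print(f"after task {t}: accuracy on tasks 0..{t} = {scores}")
```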
Author: Suglia, Alessandro; Greco, Claudio; Baker, Katie; Part, Jose L.; Papaioannou, Ioannis; Eshghi, Arash; Konstas, Ioannis; Lemon, Oliver
AI personal assistants deployed via robots or wearables require embodied understanding to collaborate with humans effectively. However, current Vision-Language Models (VLMs) primarily focus on third-person-view videos, neglecting the richness of egocentric …
External link: http://arxiv.org/abs/2406.13807
In recent years, several machine learning models trained with a language modelling objective on large-scale text-only data have been proposed. With such pretraining, they can achieve impressive results on many Natural Language Understanding … (an illustrative sketch of the language-modelling objective follows this entry).
External link: http://arxiv.org/abs/2312.02431
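The entry above refers to pretraining with a language modelling objective on text-only data. As a minimal, generic illustration of that objective (next-token prediction with cross-entropy), here is a sketch on a toy character-level corpus; the tiny recurrent model and the corpus are hypothetical stand-ins, not the architecture used in the paper.

```python
# Minimal sketch (assumed setup): next-token prediction with cross-entropy,
# i.e. the standard language-modelling objective, on a toy corpus.
import torch
import torch.nn as nn

text = "language models predict the next token "
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in text])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Shift by one position: predict token t+1 from tokens up to t.
inputs, targets = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for _ in range(200):
    optimizer.zero_grad()
    logits = model(inputs)                       # (1, seq_len, vocab)
    loss = loss_fn(logits.reshape(-1, len(vocab)), targets.reshape(-1))
    loss.backward()
    optimizer.step()

print("final LM loss:", round(loss.item(), 3))
```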
Author: Pantazopoulos, Georgios; Nikandrou, Malvina; Parekh, Amit; Hemanthage, Bhathiya; Eshghi, Arash; Konstas, Ioannis; Rieser, Verena; Lemon, Oliver; Suglia, Alessandro
Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models, including 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation. To tackle these challenges …
External link: http://arxiv.org/abs/2311.04067
The ability to handle miscommunication is crucial to robust and faithful conversational AI. People usually deal with miscommunication immediately as they detect it, using highly systematic interactional mechanisms called repair. One important type of …
External link: http://arxiv.org/abs/2307.16689