Search results

Report

Reasoning or a Semblance of it? A Diagnostic Study of Transitive Reasoning in LLMs

Authors: Mehrafarin, Houman; Eshghi, Arash; Konstas, Ioannis

Evaluating Large Language Models (LLMs) on reasoning benchmarks demonstrates their ability to solve compositional questions. However, little is known of whether these models engage in genuine logical reasoning or simply rely on implicit cues to gener

External link: http://arxiv.org/abs/2410.20200

Report

Repairs in a Block World: A New Benchmark for Handling User Corrections with Multi-Modal Language Models

Authors: Chiyah-Garcia, Javier; Suglia, Alessandro; Eshghi, Arash

In dialogue, the addressee may initially misunderstand the speaker and respond erroneously, often prompting the speaker to correct the misunderstanding in the next turn with a Third Position Repair (TPR). The ability to process and respond appropriat

External link: http://arxiv.org/abs/2409.14247

Report

Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling

Authors: Pantazopoulos, Georgios; Nikandrou, Malvina; Suglia, Alessandro; Lemon, Oliver; Eshghi, Arash

This study explores replacing Transformers in Visual Language Models (VLMs) with Mamba, a recent structured state space model (SSM) that demonstrates promising performance in sequence modeling. We test models up to 3B parameters under controlled cond

External link: http://arxiv.org/abs/2409.05395

Report

AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding

Authors: Suglia, Alessandro; Greco, Claudio; Baker, Katie; Part, Jose L.; Papaioannou, Ioannis; Eshghi, Arash; Konstas, Ioannis; Lemon, Oliver

AI personal assistants deployed via robots or wearables require embodied understanding to collaborate with humans effectively. However, current Vision-Language Models (VLMs) primarily focus on third-person view videos, neglecting the richness of egoc

External link: http://arxiv.org/abs/2406.13807

Report

Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers

Authors: Pantazopoulos, Georgios; Suglia, Alessandro; Lemon, Oliver; Eshghi, Arash

An effective method for combining frozen large language models (LLM) and visual encoders involves a resampler module that creates a 'visual prompt' which is provided to the LLM, along with the textual prompt. While this approach has enabled impressiv

External link: http://arxiv.org/abs/2404.13594

Report

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

Authors: Pantazopoulos, Georgios; Nikandrou, Malvina; Parekh, Amit; Hemanthage, Bhathiya; Eshghi, Arash; Konstas, Ioannis; Rieser, Verena; Lemon, Oliver; Suglia, Alessandro

Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models, including 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation. To tackle these challen

External link: http://arxiv.org/abs/2311.04067

Academic article

Cu/N-doped carbon spheres derived from soybean flour as an active green nanocomposite for the synthesis of 1,4-disubstituted 1H-1,2,3-triazole derivatives

Authors: Mahya Kohansal Moghadam, Hossein Eshghi, Sara S. E. Ghodsinia, Ali Shiri

Published in: Scientific Reports, Vol 14, Iss 1, Pp 1-13 (2024)

Cu immobilized onto N-doped carbon spheres (Cu/N-doped CS) derived from soybean flour was synthesized via the hydrothermal method and certified as a green high-efficiency catalyst for the regioselective synthesis of 1,4-disubstituted 1H-1,2,

External link: https://doaj.org/article/2397c200447a43c0a4777c1d0c6e5e37

Report

Learning to generate and corr- uh I mean repair language in real-time

Authors: Eshghi, Arash; Ashrafzadeh, Arash

In conversation, speakers produce language incrementally, word by word, while continuously monitoring the appropriateness of their own contribution in the dynamically unfolding context of the conversation; and this often leads them to repair their ow

External link: http://arxiv.org/abs/2308.11683
