Showing 1 - 7 of 7 for search: '"Bakr, Eslam Mohamed"'
Author:
Felemban, Abdulwahab, Bakr, Eslam Mohamed, Shen, Xiaoqian, Ding, Jian, Mohamed, Abduallah, Elhoseiny, Mohamed
We introduce iMotion-LLM: a Multimodal Large Language Model (LLM) with trajectory prediction, tailored to guide interactive multi-agent scenarios. Unlike conventional motion prediction approaches, iMotion-LLM capitalizes on textual instruct
External link:
http://arxiv.org/abs/2406.06211
While 3D MLLMs have achieved significant progress, they are restricted to object and scene understanding and struggle to understand 3D spatial structures at the part level. In this paper, we introduce Kestrel, representing a novel approach that empow
External link:
http://arxiv.org/abs/2405.18937
Author:
Bakr, Eslam Mohamed, Sun, Pengzhan, Shen, Xiaoqian, Khan, Faizan Farooq, Li, Li Erran, Elhoseiny, Mohamed
In recent years, Text-to-Image (T2I) models have been extensively studied, especially with the emergence of diffusion models that achieve state-of-the-art results on T2I synthesis tasks. However, existing benchmarks heavily rely on subjective human e
External link:
http://arxiv.org/abs/2304.05390
Most pre-trained learning systems are known to suffer from bias, which typically emerges from the data, the model, or both. Measuring and quantifying bias and its sources is a challenging task and has been extensively studied in image captioning. Des
External link:
http://arxiv.org/abs/2304.04874
Published in:
NeurIPS 2022
The 3D visual grounding task has been explored with visual and language streams comprehending referential language to identify target objects in 3D scenes. However, most existing methods devote the visual stream to capturing the 3D visual clues using
External link:
http://arxiv.org/abs/2211.14241
Recently, attention mechanisms have been explored with ConvNets, across both the spatial and channel dimensions. However, to our knowledge, all existing methods devote the attention modules to capturing local interactions at a single scale. In thi
External link:
http://arxiv.org/abs/2211.07521