Showing 1 - 10 of 231,225 for search: '"Hoi An"'
Generating realistic 3D human-object interactions (HOIs) from text descriptions is an active research topic with potential applications in virtual and augmented reality, robotics, and animation. However, creating high-quality 3D HOIs remains challengi
External link:
http://arxiv.org/abs/2411.18660
Author:
Kang, Donggoo; Jeong, Dasol; Lee, Hyunmin; Park, Sangwoo; Park, Hasil; Kwon, Sunkyu; Kim, Yeongjoon; Paik, Joonki
The Large Vision Language Model (VLM) has recently achieved remarkable progress in bridging two fundamental modalities. VLM, trained on a sufficiently large dataset, exhibits a comprehensive understanding of both visual and linguistic to perform div
External link:
http://arxiv.org/abs/2411.18038
Detecting Human-Object Interactions (HOI) in zero-shot settings, where models must handle unseen classes, poses significant challenges. Existing methods that rely on aligning visual encoders with large Vision-Language Models (VLMs) to tap into the ex
External link:
http://arxiv.org/abs/2410.23904
Author:
Gao, Jianjun; Cai, Chen; Wang, Ruoyu; Liu, Wenyang; Yap, Kim-Hui; Garg, Kratika; Han, Boon-Siew
Human-object interaction (HOI) detection has seen advancements with Vision Language Models (VLMs), but these methods often depend on extensive manual annotations. Vision Large Language Models (VLLMs) can inherently recognize and reason about interact
External link:
http://arxiv.org/abs/2410.15657
Human-object interaction (HOI) and human-scene interaction (HSI) are crucial for human-centric scene understanding applications in Embodied Artificial Intelligence (EAI), robotics, and augmented reality (AR). A common limitation faced in these resear
External link:
http://arxiv.org/abs/2410.10782
This paper focuses on Human-Object Interaction (HOI) detection, addressing the challenge of identifying and understanding the interactions between humans and objects within a given image or video frame. Spearheaded by Detection Transformer (DETR), re
External link:
http://arxiv.org/abs/2408.07430
A zero-shot human-object interaction (HOI) detector can generalize to HOI categories not encountered during training. Inspired by the impressive zero-shot capabilities offered by CLIP, latest methods strive to leverage CLIP embeddings
External link:
http://arxiv.org/abs/2408.05974
Author:
Ai, Chaoyi
Human-Object Interaction (HOI) aims to identify the pairs of humans and objects in images and to recognize their relationships, ultimately forming $\langle human, object, verb \rangle$ triplets. Under default settings, HOI performance is nearly satur
External link:
http://arxiv.org/abs/2408.05772
Zero-shot Human-Object Interaction (HOI) detection has emerged as a frontier topic due to its capability to detect HOIs beyond a predefined set of categories. This task entails not only identifying the interactiveness of human-object pairs and locali
External link:
http://arxiv.org/abs/2408.02484
Author:
Jiang-Lin, Jian-Yu; Huang, Kang-Yang; Lo, Ling; Huang, Yi-Ning; Lin, Terence; Wu, Jhih-Ciang; Shuai, Hong-Han; Cheng, Wen-Huang
Diffusion models revolutionize image generation by leveraging natural language to guide the creation of multimedia content. Despite significant advancements in such generative models, challenges persist in depicting detailed human-object interactions
External link:
http://arxiv.org/abs/2407.17911