Showing 1 - 10 of 20 for search: '"Cai, Zhixi"'
We propose Hi-SLAM, a semantic 3D Gaussian Splatting SLAM method featuring a novel hierarchical categorical representation, which enables accurate global 3D semantic mapping, scaling-up capability, and explicit semantic label prediction in the 3D wor
External link:
http://arxiv.org/abs/2409.12518
Authors:
Cai, Zhixi, Cardenas, Cristian Rojas, Leo, Kevin, Zhang, Chenyuan, Backman, Kal, Li, Hanbing, Li, Boying, Ghorbanali, Mahsa, Datta, Stavya, Qu, Lizhen, Santiago, Julian Gutierrez, Ignatiev, Alexey, Li, Yuan-Fang, Vered, Mor, Stuckey, Peter J, de la Banda, Maria Garcia, Rezatofighi, Hamid
This paper addresses the problem of autonomous UAV search missions, where a UAV must locate specific Entities of Interest (EOIs) within a time limit, based on brief descriptions in large, hazard-prone environments with keep-out zones. The UAV must pe
External link:
http://arxiv.org/abs/2409.10196
With the rapid advancements in multimodal generative technology, Affective Computing research has provoked discussion about the potential consequences of AI systems equipped with emotional intelligence. Affective Computing involves the design, evalua
External link:
http://arxiv.org/abs/2409.07256
Authors:
Cai, Zhixi, Dhall, Abhinav, Ghosh, Shreya, Hayat, Munawar, Kollias, Dimitrios, Stefanov, Kalin, Tariq, Usman
The detection and localization of deepfake content, particularly when small fake segments are seamlessly mixed with real videos, remains a significant challenge in the field of digital media security. Based on the recently released AV-Deepfake1M data
External link:
http://arxiv.org/abs/2409.06991
Understanding human social behaviour is crucial in computer vision and robotics. Micro-level observations like individual actions fall short, necessitating a comprehensive approach that considers individual behaviour, intra-group dynamics, and social
External link:
http://arxiv.org/abs/2404.04458
Authors:
Ke, Fucai, Cai, Zhixi, Jahangard, Simindokht, Wang, Weiqing, Haghighi, Pari Delir, Rezatofighi, Hamid
Recent advances in visual reasoning (VR), particularly with the aid of Large Vision-Language Models (VLMs), show promise but require access to large-scale datasets and face challenges such as high computational costs and limited generalization capabi
External link:
http://arxiv.org/abs/2403.12884
Authors:
Cai, Zhixi, Ghosh, Shreya, Adatia, Aman Pankaj, Hayat, Munawar, Dhall, Abhinav, Gedeon, Tom, Stefanov, Kalin
The detection and localization of highly realistic deepfake audio-visual content are challenging even for the most advanced state-of-the-art methods. While most of the research efforts in this domain are focused on detecting high-quality deepfake ima
External link:
http://arxiv.org/abs/2311.15308
Authors:
Hasan, Md Rakibul, Ghosh, Shreya, Agrawal, Pradyumna, Cai, Zhixi, Dhall, Abhinav, Gedeon, Tom
This paper proposes a feedback mechanism to change behavioural patterns using the Pavlok device. Pavlok utilises beeps, vibration and shocks as a mode of aversion technique to help individuals with behaviour modification. While the device can be usef
External link:
http://arxiv.org/abs/2305.06110
Authors:
Ghosh, Shreya, Cai, Zhixi, Gupta, Parul, Sharma, Garima, Dhall, Abhinav, Hayat, Munawar, Gedeon, Tom
Automatic group emotion recognition plays an important role in understanding complex human-human interaction. This paper introduces Emolysis, a Python-based, standalone open-source group emotion analysis toolkit for use in different social situation
External link:
http://arxiv.org/abs/2305.05255
Most deepfake detection methods focus on detecting spatial and/or spatio-temporal changes in facial attributes and are centered around the binary classification task of detecting whether a video is real or fake. This is because available benchmark da
External link:
http://arxiv.org/abs/2305.01979