Showing 1 - 10 of 192 for search: '"Cheng, Harry"'
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities. However, these models often inherit severe social biases from their training datasets, leading to unfair predictions
External link:
http://arxiv.org/abs/2408.06569
Authors:
Cheng, Yi, Xu, Ziwei, Lin, Dongyun, Cheng, Harry, Wong, Yongkang, Sun, Ying, Lim, Joo Hwee, Kankanhalli, Mohan
For visual content generation, discrepancies between user intentions and the generated content have been a longstanding problem. This discrepancy arises from two main factors. First, user intentions are inherently complex, with subtle details not ful
External link:
http://arxiv.org/abs/2405.12538
Authors:
Guan, Tianrui, Yang, Yurou, Cheng, Harry, Lin, Muyuan, Kim, Richard, Madhivanan, Rajasimman, Sen, Arnie, Manocha, Dinesh
In this paper, we present LOC-ZSON, a novel Language-driven Object-Centric image representation for object navigation task within complex scenes. We propose an object-centric image representation and corresponding losses for visual-language model (VL
External link:
http://arxiv.org/abs/2405.05363
Detecting diffusion-generated images has recently grown into an emerging research area. Existing diffusion-based datasets predominantly focus on general image generation. However, facial forgeries, which pose a more severe social risk, have remained
External link:
http://arxiv.org/abs/2401.15859
Notwithstanding offering convenience and entertainment to society, Deepfake face swapping has caused critical privacy issues with the rapid development of deep generative models. Due to imperceptible artifacts in high-quality synthetic images, passiv
External link:
http://arxiv.org/abs/2311.01357
Training an effective video action recognition model poses significant computational challenges, particularly under limited resource budgets. Current methods primarily aim to either reduce model size or utilize pre-trained models, limiting their adap
External link:
http://arxiv.org/abs/2307.14866
The existing deepfake detection methods have reached a bottleneck in generalizing to unseen forgeries and manipulation approaches. Based on the observation that the deepfake detectors exhibit a preference for overfitting the specific primary regions
External link:
http://arxiv.org/abs/2307.12534
Recently, Deepfake has drawn considerable public attention due to security and privacy concerns in social media digital forensics. As the wildly spreading Deepfake videos on the Internet become more realistic, traditional detection techniques have fa
External link:
http://arxiv.org/abs/2209.05299
Detecting forgery videos is highly desirable due to the abuse of deepfake. Existing detection approaches contribute to exploring the specific artifacts in deepfake videos and fit well on certain data. However, the growing technique on these artifacts
External link:
http://arxiv.org/abs/2203.02195