Showing 1 - 10 of 133 for search: '"Chen, Tianshui"'
Modern visual recognition models often display overconfidence due to their reliance on complex deep neural networks and one-hot target supervision, resulting in unreliable confidence scores that necessitate calibration. While current confidence calib
External link:
http://arxiv.org/abs/2407.06844
Recently, researchers have proposed various deep learning methods to accurately detect infrared targets with the characteristics of indistinct shape and texture. Due to the limited variety of infrared datasets, training deep learning models with good
External link:
http://arxiv.org/abs/2406.00632
Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition (CD-FER) due to the distribution variation across different domains. Current works mainly focus on learning domain-invariant features through global feature adap
External link:
http://arxiv.org/abs/2401.11085
Authors:
Fu, Hui; Wang, Zeqing; Gong, Ke; Wang, Keze; Chen, Tianshui; Li, Haojie; Zeng, Haifeng; Kang, Wenxiong
Speech-driven 3D facial animation aims to synthesize vivid facial animations that accurately synchronize with speech and match the unique speaking style. However, existing works primarily focus on achieving precise lip synchronization while neglectin
External link:
http://arxiv.org/abs/2312.10877
Authors:
Wang, Zhouxia; Yuan, Ziyang; Wang, Xintao; Chen, Tianshui; Xia, Menghan; Luo, Ping; Shan, Ying
Motions in a video primarily consist of camera motion, induced by camera movement, and object motion, resulting from object movement. Accurate control of both camera and object motion is essential for video generation. However, existing works either
External link:
http://arxiv.org/abs/2312.03641
The class-agnostic counting (CAC) task has recently been proposed to solve the problem of counting all objects of an arbitrary class with several exemplars given in the input image. To address this challenging task, existing leading methods all resor
External link:
http://arxiv.org/abs/2311.10011
Given a descriptive text query, text-based person search (TBPS) aims to retrieve the best-matched target person from an image gallery. Such a cross-modal retrieval task is quite challenging due to significant modality gap, fine-grained differences an
External link:
http://arxiv.org/abs/2311.09084
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer their relationships for a given video. It requires not only a comprehensive understanding of each object scattered on the whole scene but also a deep dive into
External link:
http://arxiv.org/abs/2309.13237
Despite advancements in LLMs, knowledge-based reasoning remains a longstanding issue due to the fragility of knowledge recall and inference. Existing methods primarily encourage LLMs to autonomously plan and solve problems or to extensively sample re
External link:
http://arxiv.org/abs/2308.11914
Blind face restoration aims at recovering high-quality face images from those with unknown degradations. Current algorithms mainly introduce priors to complement high-quality details and achieve impressive progress. However, most of these algorithms
External link:
http://arxiv.org/abs/2308.07228