Showing 1 - 10 of 1,623 for search: '"YANG Wenming"'
Author:
Kan, Zhehan, Zhang, Ce, Liao, Zihan, Tian, Yapeng, Yang, Wenming, Xiao, Junyuan, Li, Xu, Jiang, Dongmei, Wang, Yaowei, Liao, Qingmin
Large Vision-Language Model (LVLM) systems have demonstrated impressive vision-language reasoning capabilities but suffer from pervasive and severe hallucination issues, posing significant risks in critical domains such as healthcare and autonomous s
External link:
http://arxiv.org/abs/2411.12713
Monocular 3D object detection has attracted great attention due to simplicity and low cost. Existing methods typically follow conventional 2D detection paradigms, first locating object centers and then predicting 3D attributes via neighboring feature
External link:
http://arxiv.org/abs/2411.02747
Perspective projection has been extensively utilized in monocular 3D object detection methods. It introduces geometric priors from 2D bounding boxes and 3D object dimensions to reduce the uncertainty of depth estimation. However, due to depth errors
External link:
http://arxiv.org/abs/2410.19590
The Federal Funds rate in the United States plays a significant role in both domestic and international financial markets. However, research has predominantly focused on the effects of adjustments to the Federal Funds rate rather than on the decision
External link:
http://arxiv.org/abs/2410.18012
Real-world image super-resolution (Real-ISR) aims at restoring high-quality (HQ) images from low-quality (LQ) inputs corrupted by unknown and complex degradations. In particular, pretrained text-to-image (T2I) diffusion models provide strong generati
External link:
http://arxiv.org/abs/2410.13807
Artificial intelligence aids in brain tumor detection via MRI scans, enhancing the accuracy and reducing the workload of medical professionals. However, in scenarios with extremely limited medical images, traditional deep learning approaches tend to
External link:
http://arxiv.org/abs/2410.11307
In real-world scenarios, deep learning models often face challenges from both imbalanced (long-tailed) and out-of-distribution (OOD) data. However, existing joint methods rely on real OOD data, which leads to unnecessary trade-offs. In contrast, our
External link:
http://arxiv.org/abs/2410.10548
Character animation is a transformative field in computer graphics and vision, enabling dynamic and realistic video animations from static images. Despite advancements, maintaining appearance consistency in animations remains a challenge. Our approac
External link:
http://arxiv.org/abs/2408.16506
The rapid advancement of deepfake technologies has sparked widespread public concern, particularly as face forgery poses a serious threat to public information security. However, the unknown and diverse forgery techniques, varied facial features and
External link:
http://arxiv.org/abs/2408.10072
Current state-of-the-art (SOTA) methods in 3D Human Pose Estimation (HPE) are primarily based on Transformers. However, existing Transformer-based 3D HPE backbones often encounter a trade-off between accuracy and computational efficiency. To resolve
External link:
http://arxiv.org/abs/2408.02922