Showing 1 - 10 of 1,343,120 for search: '"P. Head"'
Deciding which large language model (LLM) to use is a complex challenge. Pairwise ranking has emerged as a new method for evaluating human preferences for LLMs. This approach entails humans evaluating pairs of model outputs based on a predefined crit
External link:
http://arxiv.org/abs/2411.14483
Traditional knowledge distillation focuses on aligning the student's predicted probabilities with both ground-truth labels and the teacher's predicted probabilities. However, the transition to predicted probabilities from logits would obscure certain
External link:
http://arxiv.org/abs/2411.08937
Generating animatable and editable 3D head avatars is essential for various applications in computer vision and graphics. Traditional 3D-aware generative adversarial networks (GANs), often using implicit fields like Neural Radiance Fields (NeRF), ach
External link:
http://arxiv.org/abs/2412.19149
Author:
Ali, Usman; Ranmbail, Sahil; Nadeem, Muhammad; Ishfaq, Hamid; Ramzan, Muhammad Umer; Ali, Waqas
Extracting medication names from handwritten doctor prescriptions is challenging due to the wide variability in handwriting styles and prescription formats. This paper presents a robust method for extracting medicine names using a combination of Mask
External link:
http://arxiv.org/abs/2412.18199
We present FaceLift, a feed-forward approach for rapid, high-quality, 360-degree head reconstruction from a single image. Our pipeline begins by employing a multi-view latent diffusion model that generates consistent side and back views of the head f
External link:
http://arxiv.org/abs/2412.17812
Head and neck tumors and metastatic lymph nodes are crucial for treatment planning and prognostic analysis. Accurate segmentation and quantitative analysis of these structures require pixel-level annotation, making automated segmentation techniques e
External link:
http://arxiv.org/abs/2412.14846
Skeleton-based action recognition using GCNs has achieved remarkable performance, but recognizing ambiguous actions, such as "waving" and "saluting", remains a significant challenge. Existing methods typically rely on a serial combination of GCNs and
External link:
http://arxiv.org/abs/2412.14833
Rendering photorealistic head avatars from arbitrary viewpoints is crucial for various applications like virtual reality. Although previous methods based on Neural Radiance Fields (NeRF) can achieve impressive results, they lack fidelity and efficien
External link:
http://arxiv.org/abs/2412.13983
Author:
He, Jinghan; Zhu, Kuan; Guo, Haiyun; Fang, Junfeng; Hua, Zhenglin; Jia, Yuheng; Tang, Ming; Chua, Tat-Seng; Wang, Jinqiao
Large vision-language models (LVLMs) have made substantial progress in integrating large language models (LLMs) with visual inputs, enabling advanced multimodal reasoning. Despite their success, a persistent challenge is hallucination, where generated
External link:
http://arxiv.org/abs/2412.13949
Real-world datasets often exhibit a long-tailed distribution, where the vast majority of classes, known as tail classes, have only a few samples. Traditional methods tend to overfit on these tail classes. Recently, a new approach called Imbalanced SAM (ImbSA
External link:
http://arxiv.org/abs/2412.13715