Showing 1 - 10 of 1,207 for search: '"Xu Jingyi"'
Visual object counting is a fundamental computer vision task underpinning numerous real-world applications, from cell counting in biomedicine to traffic and wildlife monitoring. However, existing methods struggle to handle the challenge of stacked 3D
External link:
http://arxiv.org/abs/2411.19149
Text-to-image diffusion models have demonstrated remarkable capability in generating realistic images from arbitrary text prompts. However, they often produce inconsistent results for compositional prompts such as "two dogs" or "a penguin on the righ
External link:
http://arxiv.org/abs/2411.18810
Diffusion models excel at high-quality image and video generation. However, a major drawback is their high latency. A simple yet powerful way to speed them up is by merging similar tokens for faster computation, though this can result in some quality
External link:
http://arxiv.org/abs/2411.16720
The task of occupancy forecasting (OCF) involves utilizing past and present perception data to predict future occupancy states of autonomous vehicle surrounding environments, which is critical for downstream tasks such as obstacle avoidance and path
External link:
http://arxiv.org/abs/2411.14169
Arctic sea ice performs a vital role in global climate and has paramount impacts on both polar ecosystems and coastal communities. In the last few years, multiple deep learning based pan-Arctic sea ice concentration (SIC) forecasting methods have eme
External link:
http://arxiv.org/abs/2410.14732
Author:
Xu, Jingyi; Tu, Siwei; Yang, Weidong; Li, Shuhao; Liu, Keyi; Luo, Yeqi; Ma, Lipeng; Fei, Ben; Bai, Lei
Variation of Arctic sea ice has significant impacts on polar ecosystems, transporting routes, coastal communities, and global climate. Tracing the change of sea ice at a finer scale is paramount for both operational applications and scientific studie
External link:
http://arxiv.org/abs/2410.09111
Place recognition is a crucial module to ensure autonomous vehicles obtain usable localization information in GPS-denied environments. In recent years, multimodal place recognition methods have gained increasing attention due to their ability to over
External link:
http://arxiv.org/abs/2410.00299
Human emotional expression is inherently dynamic, complex, and fluid, characterized by smooth transitions in intensity throughout verbal communication. However, the modeling of such intensity fluctuations has been largely overlooked by previous audio
External link:
http://arxiv.org/abs/2409.19501
Self-supervised learning of point cloud aims to leverage unlabeled 3D data to learn meaningful representations without reliance on manual annotations. However, current approaches face challenges such as limited data diversity and inadequate augmentat
External link:
http://arxiv.org/abs/2409.04963
Understanding human intentions and actions through egocentric videos is important on the path to embodied artificial intelligence. As a branch of egocentric vision techniques, hand trajectory prediction plays a vital role in comprehending human motio
External link:
http://arxiv.org/abs/2409.02638