Showing 1 - 10 of 4,655 for search: '"WANG, Mengmeng"'
Authors:
Jiang, Dengyang; Wang, Haoyu; Zhang, Lei; Wei, Wei; Dai, Guang; Wang, Mengmeng; Wang, Jingdong; Zhang, Yanning
Pre-training backbone networks on a general annotated dataset (e.g., ImageNet) that comprises numerous manually collected images with category annotations has proven to be indispensable for enhancing the generalization capacity of downstream visual t
External link:
http://arxiv.org/abs/2412.10831
Authors:
Wang, Mengmeng; Ma, Teli; Xin, Shuo; Hou, Xiaojun; Xing, Jiazheng; Dai, Guang; Wang, Jingdong; Liu, Yong
Visual Object Tracking (VOT) is an attractive and significant research area in computer vision, which aims to recognize and track specific targets in video sequences where the target objects are arbitrary and class-agnostic. The VOT technology could
External link:
http://arxiv.org/abs/2412.09991
Inferring affordable (i.e., graspable) parts of arbitrary objects based on human specifications is essential for robots advancing toward open-vocabulary manipulation. Current grasp planners, however, are hindered by limited vision-language comprehens
External link:
http://arxiv.org/abs/2411.12286
Authors:
Lin, Haonan; Wang, Mengmeng; Wang, Jiahao; An, Wenbin; Chen, Yan; Liu, Yong; Tian, Feng; Dai, Guang; Wang, Jingdong; Wang, Qianying
Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hinder
External link:
http://arxiv.org/abs/2410.18756
Authors:
Lin, Haonan; An, Wenbin; Wang, Jiahao; Chen, Yan; Tian, Feng; Wang, Mengmeng; Dai, Guang; Wang, Qianying; Wang, Jingdong
Recent advancements have shown promise in applying traditional Semi-Supervised Learning strategies to the task of Generalized Category Discovery (GCD). Typically, this involves a teacher-student framework in which the teacher imparts knowledge to the
External link:
http://arxiv.org/abs/2409.19659
Aspect-Based Sentiment Analysis (ABSA) in tourism plays a significant role in understanding tourists' evaluations of specific aspects of attractions, which is crucial for driving innovation and development in the tourism industry. However, traditiona
External link:
http://arxiv.org/abs/2409.14997
Authors:
Wang, Jiahao; Yan, Caixia; Zhang, Weizhan; Lin, Haonan; Wang, Mengmeng; Dai, Guang; Gong, Tieliang; Sun, Hao; Wang, Jingdong
Text-to-image diffusion models significantly enhance the efficiency of artistic creation with high-fidelity image generation. However, in typical application scenarios like comic book production, they can neither place each subject into its expected
External link:
http://arxiv.org/abs/2409.04801
While large models have achieved significant progress in computer vision, challenges such as optimization complexity, the intricacy of transformer architectures, computational constraints, and practical application demands highlight the importance of
External link:
http://arxiv.org/abs/2408.16886
Recent advances in Large Language Models (LLMs) have shown inspiring achievements in constructing autonomous agents that rely on language descriptions as inputs. However, it remains unclear how well LLMs can function as few-shot or zero-shot embodied
External link:
http://arxiv.org/abs/2406.16294
Authors:
Wang, Jiahao; Yan, Caixia; Lin, Haonan; Zhang, Weizhan; Wang, Mengmeng; Gong, Tieliang; Dai, Guang; Sun, Hao
Text-to-image diffusion models benefit artists with high-quality image generation. Yet their stochastic nature hinders artists from creating consistent images of the same subject. Existing methods try to tackle this challenge and generate consistent
External link:
http://arxiv.org/abs/2404.10267