Search results - "WANG, Mengmeng"

Report

Unbiased General Annotated Dataset Generation

Authors: Jiang, Dengyang, Wang, Haoyu, Zhang, Lei, Wei, Wei, Dai, Guang, Wang, Mengmeng, Wang, Jingdong, Zhang, Yanning

Pre-training backbone networks on a general annotated dataset (e.g., ImageNet) that comprises numerous manually collected images with category annotations has proven to be indispensable for enhancing the generalization capacity of downstream visual t

External link: http://arxiv.org/abs/2412.10831


Report

Visual Object Tracking across Diverse Data Modalities: A Review

Authors: Wang, Mengmeng, Ma, Teli, Xin, Shuo, Hou, Xiaojun, Xing, Jiazheng, Dai, Guang, Wang, Jingdong, Liu, Yong

Visual Object Tracking (VOT) is an attractive and significant research area in computer vision, which aims to recognize and track specific targets in video sequences where the target objects are arbitrary and class-agnostic. The VOT technology could

External link: http://arxiv.org/abs/2412.09991


Report

GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping

Authors: Ma, Teli, Wang, Zifan, Zhou, Jiaming, Wang, Mengmeng, Liang, Junwei

Inferring affordable (i.e., graspable) parts of arbitrary objects based on human specifications is essential for robots advancing toward open-vocabulary manipulation. Current grasp planners, however, are hindered by limited vision-language comprehens

External link: http://arxiv.org/abs/2411.12286


Report

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

Authors: Lin, Haonan, Wang, Mengmeng, Wang, Jiahao, An, Wenbin, Chen, Yan, Liu, Yong, Tian, Feng, Dai, Guang, Wang, Jingdong, Wang, Qianying

Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hinder

External link: http://arxiv.org/abs/2410.18756


Report

Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery

Authors: Lin, Haonan, An, Wenbin, Wang, Jiahao, Chen, Yan, Tian, Feng, Wang, Mengmeng, Dai, Guang, Wang, Qianying, Wang, Jingdong

Recent advancements have shown promise in applying traditional Semi-Supervised Learning strategies to the task of Generalized Category Discovery (GCD). Typically, this involves a teacher-student framework in which the teacher imparts knowledge to the

External link: http://arxiv.org/abs/2409.19659


Report

Enhancing Aspect-based Sentiment Analysis in Tourism Using Large Language Models and Positional Information

Authors: Xu, Chun, Wang, Mengmeng, Ren, Yan, Zhu, Shaolin

Aspect-Based Sentiment Analysis (ABSA) in tourism plays a significant role in understanding tourists' evaluations of specific aspects of attractions, which is crucial for driving innovation and development in the tourism industry. However, traditiona

External link: http://arxiv.org/abs/2409.14997


Report

SpotActor: Training-Free Layout-Controlled Consistent Image Generation

Authors: Wang, Jiahao, Yan, Caixia, Zhang, Weizhan, Lin, Haonan, Wang, Mengmeng, Dai, Guang, Gong, Tieliang, Sun, Hao, Wang, Jingdong

Text-to-image diffusion models significantly enhance the efficiency of artistic creation with high-fidelity image generation. However, in typical application scenarios like comic book production, they can neither place each subject into its expected

External link: http://arxiv.org/abs/2409.04801


Report

LV-UNet: A Lightweight and Vanilla Model for Medical Image Segmentation

Authors: Jiang, Juntao, Wang, Mengmeng, Tian, Huizhong, Cheng, Lingbo, Liu, Yong

While large models have achieved significant progress in computer vision, challenges such as optimization complexity, the intricacy of transformer architectures, computational constraints, and practical application demands highlight the importance of

External link: http://arxiv.org/abs/2408.16886


Report

LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments

Authors: Jia, Zixia, Wang, Mengmeng, Tong, Baichen, Zhu, Song-Chun, Zheng, Zilong

Recent advances in Large Language Models (LLMs) have shown inspiring achievements in constructing autonomous agents that rely on language descriptions as inputs. However, it remains unclear how well LLMs can function as few-shot or zero-shot embodied

External link: http://arxiv.org/abs/2406.16294


Report

OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

Authors: Wang, Jiahao, Yan, Caixia, Lin, Haonan, Zhang, Weizhan, Wang, Mengmeng, Gong, Tieliang, Dai, Guang, Sun, Hao

Text-to-image diffusion models benefit artists with high-quality image generation. Yet their stochastic nature hinders artists from creating consistent images of the same subject. Existing methods try to tackle this challenge and generate consistent

External link: http://arxiv.org/abs/2404.10267

