Search results - "Dang, Ronghao"

Report

Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents

Authors: Li, Long; Xu, Weiwen; Guo, Jiayan; Zhao, Ruochen; Li, Xinxuan; Yuan, Yuqian; Zhang, Boqiang; Jiang, Yuming; Xin, Yifei; Dang, Ronghao; Zhao, Deli; Rong, Yu; Feng, Tian; Bing, Lidong

Effective research ideation is a critical step for scientific research. However, the exponential increase in scientific literature makes it challenging for researchers to stay current with recent advances and identify meaningful research directions.

External link: http://arxiv.org/abs/2410.13185

Report

MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer

Authors: Zhu, Minghao; Wang, Zhengpu; Hu, Mengxian; Dang, Ronghao; Lin, Xiao; Zhou, Xun; Liu, Chengju; Chen, Qijun

Transferring visual-language knowledge from large-scale foundation models for video recognition has proved to be effective. To bridge the domain gap, additional parametric modules are added to capture the temporal information. However, zero-shot gene

External link: http://arxiv.org/abs/2410.10589

Report

Vision-and-Language Navigation via Causal Learning

Authors: Wang, Liuyi; He, Zongtao; Dang, Ronghao; Shen, Mengjiao; Liu, Chengju; Chen, Qijun

In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments

External link: http://arxiv.org/abs/2404.10241

Report

Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation

Authors: Wang, Liuyi; He, Zongtao; Dang, Ronghao; Chen, Huiyi; Liu, Chengju; Chen, Qijun

Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios. However, existing VLN methods struggle with the issue of spurious associations, resulting in poor

External link: http://arxiv.org/abs/2403.03405

Report

CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge

Authors: Lin, Xiao; Zhu, Minghao; Dang, Ronghao; Zhou, Guangliang; Shu, Shaolong; Lin, Feng; Liu, Chengju; Chen, Qijun

Most of existing category-level object pose estimation methods devote to learning the object category information from point cloud modality. However, the scale of 3D datasets is limited due to the high cost of 3D data collection and annotation. Conse

External link: http://arxiv.org/abs/2402.15726

Report

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

Authors: Dang, Ronghao; Feng, Jiangyan; Zhang, Haodong; Ge, Chongjian; Song, Lin; Gong, Lijun; Liu, Chengju; Chen, Qijun; Zhu, Feng; Zhao, Rui; Song, Yibing

We propose InstructDET, a data-centric method for referring object detection (ROD) that localizes target objects based on user instructions. While deriving from referring expressions (REC), the instructions we leverage are greatly diversified to enco

External link: http://arxiv.org/abs/2310.05136

Report

Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning

Authors: Zhu, Minghao; Lin, Xiao; Dang, Ronghao; Liu, Chengju; Chen, Qijun

As the most essential property in a video, motion information is critical to a robust and generalized video representation. To inject motion dynamics, recent works have adopted frame difference as the source of motion information in video contrastive

External link: http://arxiv.org/abs/2309.00297

Report

A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation

Authors: Wang, Liuyi; He, Zongtao; Tang, Jiagui; Dang, Ronghao; Wang, Naijia; Liu, Chengju; Chen, Qijun

Published in: International Joint Conferences on Artificial Intelligence Organization 2023

Vision-and-Language Navigation (VLN) is a realistic but challenging task that requires an agent to locate the target region using verbal and visual cues. While significant advancements have been achieved recently, there are still two broad limitation

External link: http://arxiv.org/abs/2305.03602

Report

Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation

Authors: Dang, Ronghao; Chen, Lu; Wang, Liuyi; He, Zongtao; Liu, Chengju; Chen, Qijun

We propose a meta-ability decoupling (MAD) paradigm, which brings together various object navigation methods in an architecture system, allowing them to mutually enhance each other and evolve together. Based on the MAD paradigm, we design a multiple

External link: http://arxiv.org/abs/2302.01520

Report

Search for or Navigate to? Dual Adaptive Thinking for Object Navigation

Authors: Dang, Ronghao; Wang, Liuyi; He, Zongtao; Su, Shuai; Liu, Chengju; Chen, Qijun

"Search for" or "Navigate to"? When finding an object, the two choices always come up in our subconscious mind. Before seeing the target, we search for the target based on experience. After seeing the target, we remember the target location and navig

External link: http://arxiv.org/abs/2208.00553
