Search results

EVA: An Embodied World Model for Future Video Anticipation

Authors: Chi, Xiaowei; Zhang, Hengyuan; Fan, Chun-Kai; Qi, Xingqun; Zhang, Rongyu; Chen, Anthony; Chan, Chi-min; Xue, Wei; Luo, Wenhan; Zhang, Shanghang; Guo, Yike

World models integrate raw data from various modalities, such as images and language, to simulate comprehensive interactions in the world, thereby displaying crucial roles in fields like mixed reality and robotics. Yet, applying the world model for ac

External link: http://arxiv.org/abs/2410.15461

Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

Authors: Wang, Zhi; Zhang, Li; Wu, Wenhao; Zhu, Yuanheng; Zhao, Dongbin; Chen, Chunlin

A longstanding goal of artificial general intelligence is highly capable generalists that can learn from diverse experiences and generalize to unseen tasks. The language and vision communities have seen remarkable progress toward this trend by scalin

External link: http://arxiv.org/abs/2410.11448

DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model

Authors: Gu, Songen; Yin, Wei; Jin, Bu; Guo, Xiaoyang; Wang, Junming; Li, Haodong; Zhang, Qian; Long, Xiaoxiao

We propose DOME, a diffusion-based world model that predicts future occupancy frames based on past occupancy observations. The ability of this world model to capture the evolution of the environment is crucial for planning in autonomous driving. Comp

External link: http://arxiv.org/abs/2410.10429

PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation

Authors: Zhang, Kaidong; Ren, Pengzhen; Lin, Bingqian; Lin, Junfan; Ma, Shikui; Xu, Hang; Liang, Xiaodan

Language-guided robotic manipulation is a challenging task that requires an embodied agent to follow abstract user instructions to accomplish various complex manipulation tasks. Previous work trivially fitting the data without revealing the relation

External link: http://arxiv.org/abs/2410.10394

WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents

Authors: Zhou, Siyu; Zhou, Tianyi; Yang, Yijun; Long, Guodong; Ye, Deheng; Jiang, Jing; Zhang, Chengqi

Can large language models (LLMs) directly serve as powerful world models for model-based agents? While the gaps between the prior knowledge of LLMs and the specified environment's dynamics do exist, our study reveals that the gaps can be bridged by a

External link: http://arxiv.org/abs/2410.07484

Deliberate Reasoning for LLMs as Structure-aware Planning with Accurate World Model

Authors: Xiong, Siheng; Payani, Ali; Yang, Yuan; Fekri, Faramarz

Enhancing the reasoning capabilities of large language models (LLMs) remains a key challenge, especially for tasks that require complex, multi-step decision-making. Humans excel at these tasks by leveraging deliberate planning with an internal world

External link: http://arxiv.org/abs/2410.03136

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

Authors: Liu, Zeyang; Yang, Xinrui; Sun, Shiguang; Qian, Long; Wan, Lipeng; Chen, Xingyu; Lan, Xuguang

Recent progress in generative models has stimulated significant innovations in many fields, such as image generation and chatbots. Despite their success, these models often produce sketchy and misleading solutions for complex multi-agent decision-mak

External link: http://arxiv.org/abs/2410.02664

World Model-based Perception for Visual Legged Locomotion

Authors: Lai, Hang; Cao, Jiahang; Xu, Jiafeng; Wu, Hongtao; Lin, Yunfeng; Kong, Tao; Yu, Yong; Zhang, Weinan

Legged locomotion over various terrains is challenging and requires precise perception of the robot and its surroundings from both proprioception and vision. However, learning directly from high-dimensional visual input is often data-inefficient and

External link: http://arxiv.org/abs/2409.16784

Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving

Authors: Xiao, Lingyu; Liu, Jiang-Jiang; Yang, Sen; Li, Xiaofan; Ye, Xiaoqing; Yang, Wankou; Wang, Jingdong

The autoregressive world model exhibits robust generalization capabilities in vectorized scene understanding but encounters difficulties in deriving actions due to insufficient uncertainty modeling and self-delusion. In this paper, we explore the fea

External link: http://arxiv.org/abs/2409.15730

RenderWorld: World Model with Self-Supervised 3D Label

Authors: Yan, Ziyang; Dong, Wenzhen; Shao, Yihua; Lu, Yuhang; Haiyang, Liu; Liu, Jingwen; Wang, Haozhe; Wang, Zhe; Wang, Yan; Remondino, Fabio; Ma, Yuexin

End-to-end autonomous driving with vision-only is not only more cost-effective compared to LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose Ren

External link: http://arxiv.org/abs/2409.11356
