AdaCred: Adaptive Causal Decision Transformers with Feature Crediting

Authors:	Kumawat, Hemant; Mukhopadhyay, Saibal
Year of publication:	2024
Subject:	Computer Science - Machine Learning; Computer Science - Robotics
Document type:	Working Paper
Description:	Reinforcement learning (RL) can be formulated as a sequence modeling problem, where models predict future actions based on historical state-action-reward sequences. Current approaches typically require long trajectory sequences to model the environment in offline RL settings. However, these models tend to over-rely on memorizing long-term representations, which impairs their ability to effectively attribute importance to trajectories and learned representations based on task-specific relevance. In this work, we introduce AdaCred, a novel approach that represents trajectories as causal graphs built from short-term action-reward-state sequences. Our model adaptively learns a control policy by crediting and pruning low-importance representations, retaining only those most relevant for the downstream task. Our experiments demonstrate that AdaCred-based policies require shorter trajectory sequences and consistently outperform conventional methods in both offline reinforcement learning and imitation learning environments. Comment: Accepted to the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025).
Database:	arXiv
External link:	http://arxiv.org/abs/2412.15427