Sequence Decision Transformer for Adaptive Traffic Signal Control

Autor:	Rui Zhao, Haofeng Hu, Yun Li, Yuze Fan, Fei Gao, Zhenhai Gao
Jazyk:	angličtina
Rok vydání:	2024
Předmět:	adaptive traffic signal control deep reinforcement learning Markov decision process attention mechanism proximal policy optimization Chemical technology TP1-1185
Zdroj:	Sensors, Vol 24, Iss 19, p 6202 (2024)
Druh dokumentu:	article
ISSN:	1424-8220
DOI:	10.3390/s24196202
Popis:	Urban traffic congestion poses significant economic and environmental challenges worldwide. To mitigate these issues, Adaptive Traffic Signal Control (ATSC) has emerged as a promising solution. Recent advancements in deep reinforcement learning (DRL) have further enhanced ATSC’s capabilities. This paper introduces a novel DRL-based ATSC approach named the Sequence Decision Transformer (SDT), employing DRL enhanced with attention mechanisms and leveraging the robust capabilities of sequence decision models, akin to those used in advanced natural language processing, adapted here to tackle the complexities of urban traffic management. Firstly, the ATSC problem is modeled as a Markov Decision Process (MDP), with the observation space, action space, and reward function carefully defined. Subsequently, we propose SDT, specifically tailored to solve the MDP problem. The SDT model uses a transformer-based architecture with an encoder and decoder in an actor–critic structure. The encoder processes observations and outputs, both encoded data for the decoder, and value estimates for parameter updates. The decoder, as the policy network, outputs the agent’s actions. Proximal Policy Optimization (PPO) is used to update the policy network based on historical data, enhancing decision-making in ATSC. This approach significantly reduces training times, effectively manages larger observation spaces, captures dynamic changes in traffic conditions more accurately, and enhances traffic throughput. Finally, the SDT model is trained and evaluated in synthetic scenarios by comparing the number of vehicles, average speed, and queue length against three baselines, including PPO, a DQN tailored for ATSC, and FRAP, a state-of-the-art ATSC algorithm. SDT shows improvements of 26.8%, 150%, and 21.7% over traditional ATSC algorithms, and 18%, 30%, and 15.6% over the FRAP. This research underscores the potential of integrating Large Language Models (LLMs) with DRL for traffic management, offering a promising solution to urban congestion.
Databáze:	Directory of Open Access Journals
Externí odkaz:	https://doaj.org/article/0242fc918ebc429c9c44cec2e222717d Zobrazit plný text záznamu View record in DOAJ Plný text ve formátu PDF Plný text ve formátu HTML
Nepřihlášeným uživatelům se plný text nezobrazuje	K zobrazení výsledku je třeba se přihlásit.