A Reinforcement Learning Approach to Age of Information in Multi-User Networks With HARQ

Authors: András György, Elif Tugce Ceran, Deniz Gunduz
Contributors: Commission of the European Communities, Engineering & Physical Science Research Council (EPSRC)
Publication year: 2021
Subject:
FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Networks and Communications
Computer science
Computer Science - Information Theory
Automatic repeat request
Hybrid automatic repeat request
0805 Distributed Computing
Data_CODINGANDINFORMATIONTHEORY
02 engineering and technology
Age of information
Multi-user
Constrained Markov decision process
Machine Learning (cs.LG)
Scheduling (computing)
Reinforcement learning
1005 Communications Technologies
0202 electrical engineering, electronic engineering, information engineering
Electrical and Electronic Engineering
business.industry
Information Theory (cs.IT)
Node (networking)
Hybrid automatic repeat request (HARQ)
Whittle index
020206 networking & telecommunications
0906 Electrical and Electronic Engineering
Transmission (telecommunications)
Networking & Telecommunications
business
Computer network
Communication channel
Source: IEEE Journal on Selected Areas in Communications. 39:1412-1426
ISSN: 1558-0008
0733-8716
DOI: 10.1109/jsac.2021.3065057
Description: Scheduling the transmission of time-sensitive information from a source node to multiple users over error-prone communication channels is studied with the goal of minimizing the long-term average age of information (AoI) at the users. A long-term average resource constraint is imposed on the source, which limits the average number of transmissions. The source can transmit to only a single user in each time slot; after each transmission, it receives instantaneous ACK/NACK feedback from the intended receiver and decides when and to which user to transmit the next update. Assuming the channel statistics are known, the optimal scheduling policy is studied for both the standard automatic repeat request (ARQ) and hybrid ARQ (HARQ) protocols. Then, a reinforcement learning (RL) approach is introduced to find a near-optimal policy, which does not assume any a priori information on the random processes governing the channel states. Different RL methods, including average-cost SARSA with linear function approximation (LFA), upper confidence reinforcement learning (UCRL2), and deep Q-network (DQN), are applied and compared through numerical simulations.
Database: OpenAIRE
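
Note: The setting in the description (one transmission per slot, ACK/NACK feedback, a limit on the average number of transmissions) can be illustrated with a minimal simulation sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function name `simulate_aoi`, the rule that a user's age resets to 1 on an ACK and grows by 1 otherwise, the fixed per-user error probabilities, and the simple "serve the oldest user once its age reaches a threshold" policy, which stands in for the optimal and learned policies the paper actually studies. Only standard ARQ is modeled; HARQ is omitted.

```python
import random


def simulate_aoi(error_probs, threshold, num_slots, seed=0):
    """Simulate AoI under standard ARQ with a hypothetical threshold policy.

    error_probs: assumed per-user probability that a transmission fails.
    threshold:   the source transmits only if the largest age >= threshold.
    Returns (average AoI per user per slot, average transmissions per slot).
    """
    rng = random.Random(seed)
    num_users = len(error_probs)
    ages = [1] * num_users          # assumed initial age of every user
    total_age = 0
    transmissions = 0

    for _ in range(num_slots):
        total_age += sum(ages)

        # Illustrative policy: pick the user with the largest age.
        target = max(range(num_users), key=lambda i: ages[i])
        transmit = ages[target] >= threshold

        ack = False
        if transmit:
            transmissions += 1
            # ACK if the (assumed i.i.d.) channel does not produce an error.
            ack = rng.random() >= error_probs[target]

        for i in range(num_users):
            if transmit and ack and i == target:
                ages[i] = 1         # fresh update delivered
            else:
                ages[i] += 1        # information keeps aging

    return total_age / (num_slots * num_users), transmissions / num_slots


if __name__ == "__main__":
    avg_age, tx_rate = simulate_aoi([0.2, 0.5, 0.3], threshold=4, num_slots=100_000)
    print(f"average AoI: {avg_age:.3f}, transmissions per slot: {tx_rate:.3f}")
```

Raising `threshold` lowers the transmission rate at the cost of a higher average age, which is the trade-off that the resource constraint in the description imposes.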