Routing Selection With Reinforcement Learning for Energy Harvesting Multi-Hop CRN
Author: | He Xiao, Chunlin He, Xiaoli He, Hong Jiang, Yu Song |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
energy harvesting; mathematical optimization; reinforcement learning; Q-learning; throughput; network topology; hop (networking); relay; MDP; partially observable Markov decision process; time complexity; multi-hop CRN; cognitive radio; routing selection; energy consumption; General Computer Science; General Materials Science; General Engineering; lcsh: Electrical engineering. Electronics. Nuclear engineering (TK1-9971) |
Source: | IEEE Access, Vol 7, Pp 54435-54448 (2019) |
ISSN: | 2169-3536 |
Description: | This paper considers the routing problem in the communication process of an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relays harvest energy from the environment and use it exclusively for transmitting data. Each relay on the path uses a limited data buffer to store received data and forward it. We consider a real-world scenario in which the EH nodes have only local causal knowledge, i.e., at any time, each EH node knows only its own EH process, channel state, and currently received data. An EH routing algorithm based on Q-learning in reinforcement learning (RL) for multi-hop CRNs (EHR-QL) is proposed. Our goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Through continuous intelligent selection under a partially observable Markov decision process (POMDP), we use the Q-learning algorithm in RL with linear function approximation to obtain the optimal path. Compared with basic Q-learning routing, the EHR-QL is superior over longer distances and higher hop counts: it harvests more energy, consumes less energy, and leaves predictable residual energy. In particular, the time complexity of the EHR-QL is analyzed and its convergence is proved. In the simulation experiments, we first verify the EHR-QL using six EH secondary user (EH-SU) nodes. Second, the performance of the EHR-QL (i.e., network lifetime, residual energy, and average throughput) is evaluated under different learning rates $\alpha$ and discount factors $\gamma$. Finally, the experimental results show that the EHR-QL achieves higher throughput, a longer network lifetime, lower latency, and lower energy consumption than basic Q-learning routing algorithms. |
Database: | OpenAIRE |
External link: |
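The abstract describes next-hop selection learned via Q-learning with linear function approximation. The following is a minimal toy sketch of that general technique, not the paper's actual EHR-QL algorithm: the topology, feature vector (bias, residual energy of the candidate hop, hop distance to the sink), and reward shaping are all illustrative assumptions.

```python
import random

random.seed(1)

# Toy multi-hop network as a DAG toward the sink (node 5). All names and
# values here are illustrative assumptions, not the paper's model.
neighbors = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [4, 5], 4: [5], 5: []}
hops_to_sink = {0: 3, 1: 2, 2: 2, 3: 1, 4: 1, 5: 0}
energy = {n: 1.0 for n in neighbors}  # normalized residual energy (assumed constant)

def features(state, action):
    # Linear function approximation: Q(s, a) = w . phi(s, a).
    # Features: bias, candidate hop's residual energy, negated hop distance.
    return [1.0, energy[action], -hops_to_sink[action]]

def q_value(w, s, a):
    return sum(wi * xi for wi, xi in zip(w, features(s, a)))

def choose(w, s, eps):
    # Epsilon-greedy next-hop selection.
    if random.random() < eps:
        return random.choice(neighbors[s])
    return max(neighbors[s], key=lambda a: q_value(w, s, a))

def train(episodes=500, alpha=0.05, gamma=0.9, eps=0.2):
    w = [0.0, 0.0, 0.0]
    for _ in range(episodes):
        s = 0
        while s != 5:
            a = choose(w, s, eps)
            # Assumed reward: per-hop energy cost, bonus on reaching the sink.
            r = (10.0 if a == 5 else 0.0) - 0.5
            best_next = 0.0 if a == 5 else max(q_value(w, a, a2) for a2 in neighbors[a])
            td = r + gamma * best_next - q_value(w, s, a)  # TD error
            w = [wi + alpha * td * xi for wi, xi in zip(w, features(s, a))]
            s = a
    return w

w = train()

# Greedy route from source 0 to the sink under the learned weights.
route, s = [0], 0
while s != 5:
    s = choose(w, s, eps=0.0)
    route.append(s)
```

Because the weights multiply a small feature vector rather than indexing a full Q-table, the same update rule generalizes across states, which is the usual motivation for function approximation in larger routing state spaces.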