Abstract:
Unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) data collection is an efficient and promising approach. This paper addresses the optimization of resource allocation in path planning by refining the energy consumption model and considering three metrics: the amount of collected data, time efficiency, and energy efficiency. The problem is formulated as a distributed partially observable Markov decision process (POMDP), and a novel deep reinforcement learning algorithm called RISE (Rényi state entropy)-D3QN (dueling double deep Q network) is proposed. It combines intrinsic rewards, prioritized experience replay, and a soft-max exploration strategy, enabling path planning for UAV swarms while adapting to changes in UAV battery capacity, IoT device locations, data volumes, and device counts. Simulation results demonstrate that, compared to traditional D3QN and DQN algorithms, the proposed approach significantly increases the amount of data collected from IoT devices while reducing UAV flight time and energy consumption, all while ensuring UAV safety during flight.