Popis: |
Non-terrestrial networks (NTNs) with low-earth orbit (LEO) satellites have been regarded as promising remedies to support global ubiquitous wireless services. Due to the rapid mobility of LEO satellite, inter-beam/satellite handovers happen frequently for a specific user equipment (UE). To tackle this issue, earth-fixed cell scenarios have been under studied, in which the LEO satellite adjusts its beam direction towards a fixed area within its dwell duration, to maintain stable transmission performance for the UE. Therefore, it is required that the LEO satellite performs real-time resource allocation, which however is unaffordable by the LEO satellite with limited computing capability. To address this issue, in this paper, we propose a two-time-scale collaborative deep reinforcement learning (DRL) scheme for beam management and resource allocation in NTNs, in which LEO satellite and UE with different control cycles update their decision-making policies through a sequential manner. Specifically, UE updates its policy subject to improving the value functions of both the agents. Furthermore, the LEO satellite only makes decisions through finite-step rollouts with a reference decision trajectory received from the UE. Simulation results show that the proposed scheme can effectively balance the throughput performance and computational complexity over traditional greedy-searching schemes. |