Popis: |
Network slicing (NwS) is one of the main technologies in the fifth-generation of mobile communication and beyond (5G+). One of the important challenges in the NwS is information uncertainty which mainly involves demand and channel state information (CSI). Demand uncertainty is divided into three types: number of users requests, amount of bandwidth, and requested virtual network functions workloads. Moreover, the CSI uncertainty is modeled by three methods: worst-case, probabilistic, and hybrid. In this paper, our goal is to maximize the utility of the infrastructure provider by exploiting deep reinforcement learning algorithms in end-to-end NwS resource allocation under demand and CSI uncertainties. The proposed formulation is a nonconvex mixed-integer non-linear programming problem. To perform robust resource allocation in problems that involve uncertainty, we need a history of previous information. To this end, we use a recurrent deterministic policy gradient (RDPG) algorithm, a recurrent and memory-based approach in deep reinforcement learning. Then, we compare the RDPG method in different scenarios with soft actor-critic (SAC), deep deterministic policy gradient (DDPG), distributed, and greedy algorithms. The simulation results show that the SAC method is better than the DDPG, distributed, and greedy methods, respectively. Moreover, the RDPG method out performs the SAC approach on average by 70%. |