Abstract: |
Path planning has a wide range of applications in many fields of engineering. Sampling-based algorithms are among the most widely used approaches to this problem. However, a significant limitation of these algorithms is that they cannot function effectively when the environment contains unknown obstacles. One feasible way to address this issue is to use reinforcement learning to interact with the environment. In the context of path planning, safety means preferring safe but non-optimal paths over optimal ones and incurring as few collisions as possible during both the training and test phases. In this paper, a novel two-stage algorithm called R3T*-MOSafeRL(λ) is proposed that is capable of safe path planning in an environment with unknown dynamic obstacles. In the first stage, the Roadmap Multi-Tree RRT* (R3T*) algorithm is presented, which uses an initial map of the environment and an expert's information about its important regions to generate a roadmap. The roadmap acts as a discretization of the continuous environment, making it practical to deploy tabular reinforcement learning algorithms on top of it. In the second stage, the algorithm uses a novel eligibility trace-based multi-objective safe reinforcement learning method (MOSafeRL(λ)) to perform safe path planning on the roadmap generated by R3T*. Moreover, a heatmap algorithm based on the roadmap and the weights learned by MOSafeRL(λ) is presented, which provides an interpretable way to identify the regions of the map with high activity of the unknown dynamic obstacles. Hence, the proposed algorithm provides a powerful method for path planning in unknown dynamic environments. Finally, to illustrate and verify the efficiency of the proposed algorithm, several case studies are considered, and the computational results are compared and discussed. [ABSTRACT FROM AUTHOR] |