Popis: |
Multi-robots systems exploiting sensor network capabilities can be successfully employed to cope with several tasks, including coverage, surveillance, target tracking, and foraging, in partially known environments subject to dynamical changes. In this paper we define a reward collection problem, where both the positions and the values of the rewards change with time. We propose a distributed actor-critic method to solve the problem, establish its convergence, and demonstrate its adaptation capabilities. Our analysis leverages ideas from actor-critic methods and consensus algorithms. |