Description: |
We consider the problem of designing a sequential decision-making agent to maximize an unknown function that varies over time by switching. At each step, the agent chooses a point and receives an observation of the function's value at that point, possibly corrupted by noise. The agent is also constrained to make safe decisions with high probability, i.e., the function value at each chosen point should exceed a threshold. For this switching environment, we propose a policy called Adaptive-SafeOpt and evaluate its performance via simulations. The policy combines Bayesian optimization with change point detection to solve the safe sequential optimization problem. We observe that a major challenge in adapting to a switch is identifying safe decisions once a change point is detected and preventing attraction to local optima. |