Popis: |
Reactive stepping and push recovery for biped robots is often restricted to flat terrains because of the difficulty in computing capture regions for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, The DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with a simplified 3D pendulum model and a full robot dynamics. Further, the stepper achieves a higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot that are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non convex terrain with obstacles, walk on restricted surfaces like stepping stones and recover from external disturbances for a constant computational cost. |