Sim2real Learning of Obstacle Avoidance for Robotic Manipulators in Uncertain Environments

Author:	Hui Huang, Jiatao Lin, Wing-Yue Geoffrey Louie, Tan Zhang, Kefang Zhang
Year of publication:	2022
Subject:	Control and Optimization; Generalization; Computer science; Mechanical Engineering; Biomedical Engineering; Computer Science Applications; Human-Computer Interaction; Artificial Intelligence; Control and Systems Engineering; Minimum bounding box; Bounding overwatch; Obstacle avoidance; Reinforcement learning; Robot; Computer Vision and Pattern Recognition; Representation (mathematics); Robotic arm
Source:	IEEE Robotics and Automation Letters. 7:65-72
ISSN:	2377-3774
DOI:	10.1109/lra.2021.3116700
Description:	Obstacle avoidance for robotic manipulators can be challenging when they operate in unstructured environments. This problem is approached with sim-to-real (sim2real) deep reinforcement learning, in which a motion policy for the robotic arm is learnt in a simulator and then adapted to the real world. However, sim2real adaptation is notoriously difficult. To this end, this work proposes (1) a unified representation of obstacles and targets that captures the underlying dynamics of the environment while allowing generalization to unseen goals and (2) a flexible end-to-end model that combines the unified representation with a deep reinforcement learning control module and can be trained by interacting with the environment. Such a representation is agnostic to the shape and appearance of the underlying objects, which simplifies and unifies the scene representation in both the simulated and real worlds. We implement this idea with a vision-based actor-critic framework by devising a bounding box predictor module. The predictor estimates the 3D bounding boxes of obstacles and targets from the RGB-D input. The features extracted by the predictor are fed into the policy network, and all the modules are jointly trained. This makes the policy learn an object-aware scene representation, which leads to data-efficient learning of the obstacle avoidance policy. Our experiments in simulated environments and the real world show that the end-to-end model with the unified representation achieves better sim2real adaptation and scene generalization than state-of-the-art techniques.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::bddf24970eb500ea9ff5261859aeded1 https://doi.org/10.1109/lra.2021.3116700
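
The description refers to a unified representation in which obstacles and targets are both reduced to 3D bounding boxes, independent of object shape and appearance. The following is a minimal sketch of that idea only, not the authors' implementation: the `Box3D` type, the `encode_scene` function, the axis-aligned box format, the 7-value layout per box, and the zero-padding scheme are all assumptions made for illustration. In the work described, the policy consumes features from a learned predictor on RGB-D input rather than hand-built vectors like this one.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical layout: each box contributes 3 center + 3 size + 1 role flag values.
BOX_DIM = 7


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D bounding box (an illustrative assumption, not the paper's format)."""
    center: Tuple[float, float, float]  # box center, e.g. in the robot base frame (assumed)
    size: Tuple[float, float, float]    # extents along x, y, z
    is_target: bool                     # True for a target, False for an obstacle


def encode_scene(boxes: Sequence[Box3D], max_boxes: int = 4) -> List[float]:
    """Flatten up to `max_boxes` boxes into a fixed-length, zero-padded state vector."""
    if len(boxes) > max_boxes:
        raise ValueError(f"expected at most {max_boxes} boxes, got {len(boxes)}")
    state: List[float] = []
    for box in boxes:
        state.extend(box.center)
        state.extend(box.size)
        state.append(1.0 if box.is_target else 0.0)
    # Pad so the vector has the same length regardless of how many objects are present.
    state.extend([0.0] * (BOX_DIM * (max_boxes - len(boxes))))
    return state


if __name__ == "__main__":
    scene = [
        Box3D(center=(0.4, 0.0, 0.2), size=(0.1, 0.1, 0.3), is_target=False),
        Box3D(center=(0.6, 0.2, 0.1), size=(0.05, 0.05, 0.05), is_target=True),
    ]
    print(len(encode_scene(scene)))
```

Because obstacles and targets share one box format and differ only in a role flag, the same encoding could in principle be produced from simulated or real scenes, which is the property the description credits for easier sim2real transfer.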