Robotic Learning From Advisory and Adversarial Interactions Using a Soft Wrist
Author: | Chisato Nakashima, Kazutoshi Tanaka, Yoshihisa Ijiri, Masashi Hamaya, Yoshiya Shibata, Felix von Drigalski |
---|---|
Year of publication: | 2021 |
Subject: | 0209 industrial biotechnology; Control and Optimization; Computer science; Underactuation; Mechanical Engineering; Biomedical Engineering; Soft robotics; 02 engineering and technology; 010501 environmental sciences; 01 natural sciences; Human–robot interaction; Computer Science Applications; Human-Computer Interaction; 020901 industrial engineering & automation; Artificial Intelligence; Control and Systems Engineering; Robustness (computer science); Human–computer interaction; Task analysis; Reinforcement learning; Domain knowledge; Robot; Computer Vision and Pattern Recognition; 0105 earth and related environmental sciences |
Source: | IEEE Robotics and Automation Letters. 6:3878-3885 |
ISSN: | 2377-3774 |
Description: | In this letter, we developed a novel learning framework from physical human-robot interactions. Owing to human domain knowledge, such interactions can be useful for facilitation of learning. However, applying numerous interactions for training data might place a burden on human users, particularly in real-world applications. To address this problem, we propose formulating this as a model-based reinforcement learning problem to reduce errors during training and increase robustness. Our key idea is to develop 1) an advisory and adversarial interaction strategy and 2) a human-robot interaction model to predict each behavior. In the advisory and adversarial interactions, a human guides and disturbs the robot when it moves in the wrong and correct directions, respectively. Meanwhile, the robot tries to achieve its goal in conjunction with predicting the human's behaviors using the interaction model. To verify the proposed method, we conducted peg-in-hole experiments in a simulation and real-robot environment with human participants and a robot, which has an underactuated soft wrist module. The experimental results showed that our proposed method had smaller position errors during training and a higher number of successes than the baselines without any interactions and with random interactions. |
Database: | OpenAIRE |