Learning Hierarchical Acquisition Functions for Bayesian Optimization
Authors: | Jan Peters, Nils Rottmann, Elmar Rueckert, Tjasa Kunavar, Jan Babič |
---|---|
Year of publication: | 2020 |
Subject: |
Computer science; Process (engineering); Inference; Engineering and technology; Machine learning; Task (project management); Medical and health sciences; Clinical medicine; Electrical engineering, electronic engineering, information engineering; Function (engineering); Gaussian process; Bayesian optimization; Sampling (statistics); Task analysis; Artificial intelligence & image processing; Artificial intelligence; Neurology & neurosurgery; Humanoid robot |
Source: | IROS |
DOI: | 10.1109/iros45743.2020.9341335 |
Description: | Learning control policies in robotic tasks requires a large number of interactions due to small learning rates, bounds on the updates, or unknown constraints. In contrast, humans can infer protective and safe solutions after a single failure or unexpected observation. To reach similar performance, we developed a hierarchical Bayesian optimization algorithm that replicates the cognitive inference and memorization process for avoiding failures in motor control tasks. A Gaussian process implements the modeling and the sampling of the acquisition function. This enables rapid learning with large learning rates, while a mental replay phase ensures that policy regions that led to failures are inhibited during the sampling process. The features of the hierarchical Bayesian optimization method are evaluated in a simulated and physiological humanoid postural balancing task. The method outperforms standard optimization techniques, such as Bayesian optimization, in the number of interactions needed to solve the task, in the computational demands, and in the frequency of observed failures. Further, we show that our method performs similarly to humans in learning the postural balancing task by comparing our simulation results with real human data. |
Database: | OpenAIRE |
External link: |
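
The description above mentions a Gaussian process model, an acquisition function, and the inhibition of policy regions that led to failures. The following is a minimal sketch of that general idea only, not the authors' hierarchical algorithm: it assumes a one-dimensional policy parameter in [0, 1], swaps in a plain upper-confidence-bound acquisition, and subtracts a Gaussian-shaped penalty around previously failed parameters. All names (`rbf_kernel`, `gp_posterior`, `inhibition`, `optimize`) and all default values (`kappa`, `penalty`, `width`, `length_scale`) are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def inhibition(x_query, failures, width=0.1):
    """Penalty in [0, 1] that is largest near previously failed parameters."""
    if len(failures) == 0:
        return np.zeros_like(x_query)
    d = x_query[:, None] - np.asarray(failures)[None, :]
    return np.max(np.exp(-0.5 * (d / width) ** 2), axis=1)

def optimize(reward, is_failure, n_iter=15, kappa=2.0, penalty=5.0, seed=0):
    """Assumed UCB-style loop with failure inhibition (illustrative only)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)            # candidate policy parameters
    xs = [float(x) for x in rng.uniform(0.0, 1.0, size=2)]  # initial trials
    ys = [reward(x) for x in xs]
    failures = [x for x in xs if is_failure(x)]
    for _ in range(n_iter):
        mean, std = gp_posterior(np.array(xs), np.array(ys), grid)
        # Optimistic estimate minus a penalty around remembered failures.
        acq = mean + kappa * std - penalty * inhibition(grid, failures)
        x_next = float(grid[np.argmax(acq)])
        xs.append(x_next)
        ys.append(reward(x_next))
        if is_failure(x_next):
            failures.append(x_next)
    best = int(np.argmax(ys))
    return xs[best], ys[best], failures
```

Under these assumptions, a call such as `optimize(lambda x: -(x - 0.3) ** 2, lambda x: x > 0.8)` returns the best parameter found, its reward, and the list of remembered failures. The penalty term is what keeps later samples away from parameters that already failed; the record does not specify how the published method implements this step.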