Autonomous acquisition of arbitrarily complex skills using locality based graph theoretic features: a syntactic approach to hierarchical reinforcement learning

Authors: Borahan Tümer, Zeynep Kumralbaş, Kutalmış Coşkun, Semiha Hazel Çavuş
Contributors: Kumralbaş Z., Çavuş S. H., Coşkun K., Tümer B.
Year of publication: 2023
Subject:
Dynamic community detection
Control and Optimization
Temporal abstraction
Natural Sciences (SCI)
Engineering
AUTOMATION & CONTROL SYSTEMS
Modeling and Simulation
Skill construction
MATHEMATICS
Information Systems
Communication and Control Engineering
Hierarchical reinforcement learning
Reinforcement learning
Skill coupling
Computer Sciences
Computing & Technology (ENG)
Computer Science Applications
Community detection
Natural Sciences
COMPUTER SCIENCE
MATHEMATICS, APPLIED
Physical Sciences
Control and Systems Engineering
Engineering and Technology
Source: Evolving Systems
ISSN: 1868-6486, 1868-6478
DOI: 10.1007/s12530-022-09478-6
Description: © 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature. As the state/action space grows, learning a satisfactory policy with regular Reinforcement Learning (RL) algorithms such as flat Q-learning quickly becomes infeasible. One possible solution is to employ hierarchical RL (HRL). In this work, we present two methods that autonomously construct (1) skills (ASKA) and (2) arbitrarily elaborate superskills, or complexes, by defining an arbitrary number of hierarchies in HRL (ASKAC), over a graph-based, iteratively growing environment model. We employ dynamic community detection (DCD) to detect subgoals: groups of environment states (i.e., subenvironments) are modeled as communities in the graph-theoretic sense, and because DCD considers only local changes in the partially grown graph, it lowers the time complexity of subgoal detection. DCD's drawback is oversegmentation, in which it mispartitions a subenvironment into smaller components. To keep ASKAC robust against possible oversegmentation by DCD, we introduce the concept of skill coupling. Skill coupling not only resolves the oversegmentation issue robustly but also improves HRL by building more elaborate complexes (i.e., skill compositions) at an arbitrary number of hierarchies, which reduces the number of decisions needed to reach the goal. In addition to experiments that investigate the effect of parameters, the proposed methods are evaluated experimentally in grid world and taxi driver benchmark environments.
Database: OpenAIRE