MarioDAgger: A Time and Space Efficient Autonomous Driver
Authors: | Farzad Kamrani, Andreas Elers, Amir H. Payberah, Mika Cohen |
---|---|
Year of publication: | 2020 |
Subject: | Forgetting; Training set; Computer science; Supervised learning; Retraining; Context (language use); Machine learning; Set (abstract data type); Artificial intelligence; Reservoir sampling; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 05 social sciences; 0501 psychology and cognitive sciences; 050105 experimental psychology |
Source: | ICMLA |
DOI: | 10.1109/icmla51294.2020.00230 |
Description: | Imitation learning is a promising approach for training autonomous vehicles, in which a set of state-action pairs from human-demonstrated driving is used as training data for supervised learning. Dataset Aggregation (DAgger) is a common imitation learning algorithm in which models are trained by iteratively collecting new data, aggregating it with old data, and retraining the model on the entire collected dataset. Data aggregation and retraining, however, lead to two main problems: (i) large memory consumption, and (ii) long training time. In this work, we present a fast and memory-efficient algorithm, called MarioDAgger, that improves on DAgger by resolving these two problems. Unlike DAgger, which requires the full collection of old and new data to train the models, MarioDAgger uses only the new data and a few samples from the old data stored in a rehearsal buffer, which is updated iteratively using reservoir sampling. To prevent forgetting of old knowledge, MarioDAgger uses a recent regularization technique, Elastic Weight Consolidation. We evaluate MarioDAgger against SafeDAgger, a recent variant of DAgger that MarioDAgger builds upon, in the context of autonomous vehicles, and show that MarioDAgger achieves the same performance as SafeDAgger in half as many iterations, using significantly less memory. |
Database: | OpenAIRE |
External link: |
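
The two mechanisms named in the description, a rehearsal buffer updated by reservoir sampling and an Elastic Weight Consolidation (EWC) penalty, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives no code, so the names `ReservoirBuffer` and `ewc_penalty`, the buffer capacity, the seed, and the flat-list parameter representation are all assumptions made for illustration. The buffer follows the standard reservoir sampling scheme (Algorithm R), and the penalty uses the usual quadratic EWC form, (lambda / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

```python
import random
from typing import Generic, Iterable, List, Sequence, TypeVar

T = TypeVar("T")


class ReservoirBuffer(Generic[T]):
    """Hypothetical fixed-size rehearsal buffer (reservoir sampling, Algorithm R).

    After n items have been offered, each one is retained with
    probability capacity / n, so memory stays bounded while the buffer
    remains a uniform sample of everything seen so far.
    """

    def __init__(self, capacity: int, seed: int = 0) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[T] = []
        self.seen = 0  # total number of items offered so far
        self._rng = random.Random(seed)

    def add(self, item: T) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always keep the item.
            self.items.append(item)
            return
        # Draw a slot uniformly from [0, seen); replace only if it
        # falls inside the buffer.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = item

    def extend(self, batch: Iterable[T]) -> None:
        for item in batch:
            self.add(item)


def ewc_penalty(
    params: Sequence[float],
    old_params: Sequence[float],
    fisher: Sequence[float],
    lam: float,
) -> float:
    """Quadratic EWC penalty on flat parameter lists (assumed layout).

    `fisher` holds per-parameter (diagonal) Fisher information estimates;
    `lam` weights how strongly old knowledge is protected.
    """
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher)
    )
```

In a DAgger-style loop, one would presumably train each iteration on the newly collected state-action pairs plus `buffer.items`, add the value of `ewc_penalty` to the supervised loss, and then call `buffer.extend(new_pairs)`; the exact schedule used in the paper is not stated in this record.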