MarioDAgger: A Time and Space Efficient Autonomous Driver

Authors:	Farzad Kamrani, Andreas Elers, Amir H. Payberah, Mika Cohen
Year of publication:	2020
Subject:	Forgetting; Training set; Computer science; 05 social sciences; Supervised learning; Retraining; Context (language use); 02 engineering and technology; Machine learning; 050105 experimental psychology; Set (abstract data type); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; Artificial intelligence; Reservoir sampling
Source:	ICMLA
DOI:	10.1109/icmla51294.2020.00230
Description:	Imitation learning is a promising approach for training autonomous vehicles, in which a set of state-action pairs from human-demonstrated driving is used as training data for supervised learning. Dataset Aggregation (DAgger) is a common imitation learning algorithm in which models are trained by iteratively collecting new data, aggregating it with old data, and retraining the model on the entire collected dataset. Data aggregation and retraining, however, lead to two main problems: (i) large memory consumption and (ii) long training time. In this work, we present a fast and memory-efficient algorithm, called MarioDAgger, that improves DAgger by resolving these problems. Unlike DAgger, which requires the full collection of old and new data to train the models, MarioDAgger uses only the new data and a few samples from the old data stored in a rehearsal buffer, which is updated iteratively using reservoir sampling. To prevent forgetting old knowledge, MarioDAgger uses a recent regularization technique, Elastic Weight Consolidation. We evaluate and compare MarioDAgger with SafeDAgger, a recent variant of DAgger that MarioDAgger builds upon, in the context of autonomous vehicles, and show that MarioDAgger achieves the same performance as SafeDAgger in half as many iterations while using significantly less memory.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::46e78cd88b1bb9e409dcc616504eddd5 https://doi.org/10.1109/icmla51294.2020.00230
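
The description mentions a rehearsal buffer that is updated iteratively using reservoir sampling. As a rough illustration of that idea only, the sketch below uses the classic reservoir sampling scheme (Algorithm R) to keep a fixed-size uniform sample of all state-action pairs seen so far. It is not the authors' implementation: the names `ReservoirBuffer` and `training_set`, the buffer capacity, and the choice to build each iteration's training set as "new data plus buffer contents" are assumptions made for this example, and the Elastic Weight Consolidation part of the method is not shown.

```python
import random


class ReservoirBuffer:
    """Fixed-size rehearsal buffer filled by reservoir sampling (Algorithm R).

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity      # maximum number of stored samples
        self.items = []               # current buffer contents
        self.seen = 0                 # total samples offered so far
        self._rng = random.Random(seed)

    def add(self, sample):
        """Offer one sample; each sample seen so far stays with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not full yet: always keep the sample.
            self.items.append(sample)
        else:
            # Pick a slot in [0, seen); replace only if it falls inside the buffer.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def extend(self, samples):
        for sample in samples:
            self.add(sample)


def training_set(new_data, buffer):
    """Assumed per-iteration training set: all new data plus the buffered old samples."""
    return list(new_data) + list(buffer.items)


# Example: three data-collection iterations of 100 (state, action) pairs each.
buffer = ReservoirBuffer(capacity=20, seed=0)
for iteration in range(3):
    new_data = [((iteration, k), "action") for k in range(100)]
    batch = training_set(new_data, buffer)   # train on this instead of all data
    buffer.extend(new_data)                  # then fold the new data into the buffer
```

With this arrangement the training set never grows beyond the size of one iteration's data plus the buffer capacity, whereas full aggregation would grow linearly with the number of iterations.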