Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning

Authors:	Macheng Shen, Jonathan P. How
Contributors:	Massachusetts Institute of Technology, Department of Aeronautics and Astronautics; Massachusetts Institute of Technology, Department of Mechanical Engineering
Year of publication:	2019
Subject:	FOS: Computer and information sciences; Active perception; Computer Science - Artificial Intelligence; business.industry; Computer science; Autonomous agent; Partially observable Markov decision process; 02 engineering and technology; 010501 environmental sciences; Adversary; 01 natural sciences; Adversarial system; Artificial Intelligence (cs.AI); Action (philosophy); Robustness (computer science); 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; 020201 artificial intelligence & image processing; Artificial intelligence; business; 0105 earth and related environmental sciences
Source:	ICRA; arXiv
DOI:	10.48550/arxiv.1902.05644
Description:	© 2019 IEEE. We pose an active perception problem in which an autonomous agent actively interacts with a second agent that may behave adversarially. Given the uncertainty in the other agent's intent, the objective is to collect further evidence to help discriminate potential threats. The main technical challenges are the partial observability of the agent's intent, adversary modeling, and the corresponding uncertainty modeling. Note that an adversarial agent may try to mislead the autonomous agent with a deceptive strategy learned from past experience. We propose an approach that combines belief space planning, generative adversary modeling, and maximum entropy reinforcement learning to obtain a stochastic belief space policy. By accounting for various adversarial behaviors in the simulation framework and minimizing the predictability of the autonomous agent's actions, the resulting policy is more robust to unmodeled adversarial strategies. This improved robustness is shown empirically against an adversary that adapts to and exploits the autonomous agent's policy, in comparison with a standard Chance-Constrained Partially Observable Markov Decision Process robust approach. Funding: ARL (Award W911NF-17-2-0181).
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f209b656eb7a2f5998f734a70fcb662c
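
Illustrative note: the description mentions a belief over the other agent's intent and a stochastic policy obtained with maximum entropy reinforcement learning. The Python sketch below shows two generic ingredients of that idea: a Bayes update of the belief that the other agent is adversarial, and a Boltzmann (softmax) action distribution, which is the form an entropy-regularized objective yields for a single decision with fixed action values. It is not the authors' implementation; the function names, the two-hypothesis intent model, the likelihood values, and the temperature parameter are all assumptions made for illustration.

```python
import math


def update_belief(prior_adversarial, lik_adversarial, lik_benign):
    """Bayes update of P(intent = adversarial) after one observation.

    lik_adversarial and lik_benign are the (hypothetical) probabilities of
    the observation under each intent hypothesis.
    """
    numerator = prior_adversarial * lik_adversarial
    evidence = numerator + (1.0 - prior_adversarial) * lik_benign
    if evidence == 0.0:
        # Observation is impossible under both hypotheses: keep the prior.
        return prior_adversarial
    return numerator / evidence


def max_entropy_policy(q_values, temperature=1.0):
    """Boltzmann distribution over actions: pi(a) proportional to exp(Q(a) / temperature)."""
    peak = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - peak) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probabilities):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


# Hypothetical usage: one observation that is four times as likely under the
# adversarial hypothesis, followed by a stochastic action choice.
belief = update_belief(0.5, 0.8, 0.2)
policy = max_entropy_policy([1.0, 0.5, 0.0], temperature=0.5)
```

A higher temperature spreads probability more evenly across actions, which makes the agent harder to predict at the cost of choosing lower-valued actions more often; this is the trade-off the description refers to as minimizing predictability.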