Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
| Authors | Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin, Yoav Artzi |
|---|---|
| Year | 2018 |
| Subject | FOS: Computer and information sciences; Computer Science - Computation and Language (cs.CL); Machine learning; Artificial intelligence |
| Source | EMNLP |
| DOI | 10.48550/arxiv.1809.00786 |
| Description | We propose to decompose instruction execution into goal prediction and action generation. We design a model that maps raw visual observations to goals using LINGUNET, a language-conditioned image generation network, and then generates the actions required to reach those goals. Our model is trained from demonstrations only, without external resources. To evaluate our approach, we introduce two benchmarks for instruction following: LANI, a navigation task, and CHAI, in which an agent executes household instructions. Our evaluation demonstrates the advantages of our model decomposition and illustrates the challenges posed by our new benchmarks. Comment: Accepted at EMNLP 2018. |
| Database | OpenAIRE |
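The description's two-stage decomposition (first predict a goal location in the visual observation with a language-conditioned network, then generate the actions to reach it) can be made concrete with a small sketch. Below is a minimal, illustrative LingUNet-style goal predictor in PyTorch: the instruction embedding is split into K chunks, each chunk is turned into per-example 1x1 convolution kernels that filter the image features at one encoder scale, and an upsampling decoder produces a goal-location map. All layer sizes, the number of scales, and the names (`LingUNetSketch`, `text_dim`, etc.) are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal sketch of a LingUNet-style language-conditioned goal predictor.
# Hypothetical layer sizes and names; not the authors' exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LingUNetSketch(nn.Module):
    def __init__(self, in_channels=3, hidden=16, text_dim=64, k=3):
        super().__init__()
        assert text_dim % k == 0
        self.k, self.hidden = k, hidden
        # Encoder: k strided conv blocks produce feature maps F_1..F_k.
        self.encoders = nn.ModuleList([
            nn.Conv2d(in_channels if i == 0 else hidden, hidden,
                      kernel_size=3, stride=2, padding=1)
            for i in range(k)
        ])
        # Each chunk of the instruction embedding is mapped to the weights
        # of a 1x1 convolution that filters the matching encoder scale.
        self.text_to_kernel = nn.ModuleList([
            nn.Linear(text_dim // k, hidden * hidden) for _ in range(k)
        ])
        # Decoder: k upsampling blocks mirror the encoder, with skips.
        self.decoders = nn.ModuleList([
            nn.ConvTranspose2d(hidden if i == 0 else 2 * hidden, hidden,
                               kernel_size=4, stride=2, padding=1)
            for i in range(k)
        ])
        self.out = nn.Conv2d(hidden, 1, kernel_size=1)  # goal logits

    def forward(self, image, text):
        # image: (B, C, H, W); text: (B, text_dim) instruction embedding.
        feats, x = [], image
        for enc in self.encoders:
            x = F.relu(enc(x))
            feats.append(x)
        # Language-conditioned filtering at every scale: one 1x1 kernel
        # per example, applied with a grouped convolution.
        gated = []
        for f, chunk, lin in zip(feats, torch.chunk(text, self.k, dim=1),
                                 self.text_to_kernel):
            b = f.size(0)
            w = lin(chunk).view(b * self.hidden, self.hidden, 1, 1)
            g = F.conv2d(f.reshape(1, b * self.hidden, *f.shape[2:]), w,
                         groups=b)
            gated.append(g.view_as(f))
        # Decode from the deepest scale up, concatenating skip connections.
        y = F.relu(self.decoders[0](gated[-1]))
        for dec, skip in zip(self.decoders[1:], reversed(gated[:-1])):
            y = F.relu(dec(torch.cat([y, skip], dim=1)))
        return self.out(y)  # (B, 1, H, W) goal-location logits


# Usage: predict a goal map for a 32x32 observation and a 64-d instruction.
net = LingUNetSketch()
logits = net(torch.randn(2, 3, 32, 32), torch.randn(2, 64))
print(logits.shape)  # torch.Size([2, 1, 32, 32])
```

In the paper's full model the predicted goal map is then consumed by a separate action generator that produces the executed actions; this sketch stops at the goal-prediction stage.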