Visual Question Answering Based on Question Attention Model
| Author: | Zhaochang Wu, Huajie Zhang, Yunfang Chen, Jianing Zhang |
| --- | --- |
| Year of publication: | 2020 |
| Subject: | |
| Source: | Journal of Physics: Conference Series. 1624:022022 |
| ISSN: | 1742-6596, 1742-6588 |
| Description: | Visual Question Answering (VQA), the task of answering natural language questions about images, has become popular in the field of artificial intelligence. At present, most VQA models extract features from the whole image, which consumes a large amount of computation and results in a complex structure. In this paper, we propose a VQA method based on a question attention model. First, a Convolutional Neural Network (CNN) extracts image features from the input images, and the question text is processed by a Long Short-Term Memory (LSTM) network. Then, we design a question attention module that lets the learning algorithm focus on the most relevant features of the input text. Guided by the question features, our method uses the attention module to apply the corresponding weights to the image features and extract the information needed to generate the words of the answer sequence. Our method performs significantly better than the LSTM Q+I model on the MS COCO VQA dataset, with an accuracy improvement of 2%. A sketch of this pipeline appears after the record. |
| Database: | OpenAIRE |
| External link: | |
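
The description outlines a three-part pipeline: a CNN produces image features, an LSTM encodes the question, and a question-guided attention module weights the image features before answer prediction. Below is a minimal sketch of that pipeline, assuming PyTorch; all layer names, dimensions, and the additive attention scoring are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of a question-attention VQA model (not the authors' code).
# Image features are assumed precomputed by a CNN as a grid of region vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionAttentionVQA(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512,
                 img_dim=512, num_answers=1000):
        super().__init__()
        # Question encoder: word embeddings fed to an LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Projections that score each image region against the question.
        self.img_proj = nn.Linear(img_dim, hidden_dim)
        self.att_score = nn.Linear(hidden_dim, 1)
        # Classifier over a fixed answer vocabulary.
        self.classifier = nn.Linear(hidden_dim * 2, num_answers)

    def forward(self, img_feats, question):
        # img_feats: (B, R, img_dim) -- R region vectors from a pretrained CNN
        # question:  (B, T) token ids
        _, (h, _) = self.lstm(self.embed(question))
        q = h[-1]                                   # (B, hidden_dim)
        # Question-guided additive attention over image regions.
        proj = self.img_proj(img_feats)             # (B, R, hidden_dim)
        scores = self.att_score(torch.tanh(proj + q.unsqueeze(1)))  # (B, R, 1)
        alpha = F.softmax(scores, dim=1)            # attention weights per region
        attended = (alpha * proj).sum(dim=1)        # (B, hidden_dim)
        # Fuse attended image features with the question and classify.
        return self.classifier(torch.cat([attended, q], dim=1))

# Usage: a batch of 2 images with 49 regions (e.g. a 7x7 CNN feature grid).
model = QuestionAttentionVQA(vocab_size=10000)
img = torch.randn(2, 49, 512)
qs = torch.randint(0, 10000, (2, 12))
logits = model(img, qs)                             # (2, 1000) answer scores
```

The design choice worth noting is that the attention weights are conditioned on the question encoding, so only question-relevant regions contribute to the fused representation, which is the efficiency argument the abstract makes against processing whole-image features uniformly.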