Visual question answering with gated relation‐aware auxiliary
Authors: | Xiangjun Shao, Zhenglong Xiang, Yuanxiang Li |
---|---|
Language: | English |
Publication year: | 2022 |
Subject: | |
Source: | IET Image Processing, Vol 16, Iss 5, Pp 1424-1432 (2022) |
Document type: | article |
ISSN: | 1751-9667 1751-9659 |
DOI: | 10.1049/ipr2.12421 |
Description: | Abstract: Major advances in computer vision and natural language processing have driven significant progress in visual question answering. In the visual question answering task, the visual representation is essential for understanding the image content. However, traditional methods rarely exploit the question‐related context information of the visual features, or relation‐aware information, to capture a valuable visual representation. Therefore, a gated relation‐aware model is proposed to capture an enhanced visual representation for predicting the desired answer. The gated relation‐aware module can learn relation‐aware information between the visual feature and, respectively, the context and a particular object in the image. In addition, the proposed module can filter out unnecessary relation‐aware information through a gate guided by the semantic representation of the question. Experimental results show that the gated relation‐aware module yields a significant improvement across all answer categories. |
Database: | Directory of Open Access Journals |
External link: |
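
The abstract describes a gate, guided by the question representation, that filters relation‐aware information before it enriches the visual features. The record gives no equations, so the following is only a minimal illustrative sketch of that general idea, not the authors' model. Everything in it is an assumption made for illustration: the function names (`relation_aware`, `gated_relation_aware`), the scaled dot‐product attention over objects, the element‐wise sigmoid gate with no learned weights, and the residual sum. It also models only object‐to‐object relations, not the context branch mentioned in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relation_aware(features):
    """For each object, return a softmax-weighted sum of all object features.

    Weights come from scaled dot-product scores between object features
    (an assumed stand-in for the paper's relation-aware computation).
    """
    d = len(features[0])
    out = []
    for v_i in features:
        weights = softmax([dot(v_i, v_j) / math.sqrt(d) for v_j in features])
        out.append(
            [sum(w * v_j[k] for w, v_j in zip(weights, features)) for k in range(d)]
        )
    return out


def gated_relation_aware(features, question):
    """Add question-gated relation information to each object feature.

    The gate is sigmoid(question * relation) element-wise, so each value
    lies in (0, 1) and scales how much relation information passes through.
    This gate form is hypothetical; a real model would use learned weights.
    """
    relations = relation_aware(features)
    enhanced = []
    for v, r in zip(features, relations):
        gate = [sigmoid(q_k * r_k) for q_k, r_k in zip(question, r)]
        enhanced.append([v_k + g_k * r_k for v_k, g_k, r_k in zip(v, gate, r)])
    return enhanced


# Toy example: three 2-dimensional "object" features and one question vector.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question = [2.0, -2.0]
enhanced = gated_relation_aware(features, question)
```

With these non‐negative toy features, a positive question component opens the gate for that dimension (gate above 0.5) and a negative one closes it (gate below 0.5), which is the filtering behavior the abstract attributes to the question‐guided gate.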