Visual question answering with gated relation‐aware auxiliary

Author: Xiangjun Shao, Zhenglong Xiang, Yuanxiang Li
Language: English
Year of publication: 2022
Subject:
Source: IET Image Processing, Vol 16, Iss 5, Pp 1424-1432 (2022)
Document type: article
ISSN: 1751-9667
1751-9659
DOI: 10.1049/ipr2.12421
Description: Abstract: Advances in computer vision and natural language processing have driven significant progress in visual question answering. In this task, the visual representation is essential for understanding the image content. However, traditional methods rarely exploit either the question‐related context of the visual features or relation‐aware information when building that representation. A gated relation‐aware model is therefore proposed to capture an enhanced visual representation for predicting the desired answer. The gated relation‐aware module learns relation‐aware information between a visual feature and its context, and between a visual feature and a particular object in the image. In addition, a gate guided by the semantic representation of the question filters out unnecessary relation‐aware information. Experimental results show that the gated relation‐aware module yields a significant improvement on all answer categories.
Database: Directory of Open Access Journals
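
Illustrative note: the abstract describes a gate, guided by the question representation, that filters relation-aware information before it enriches the visual features. The sketch below shows one way such a gate could work. It is not the authors' architecture: the dot-product attention, the element-wise sigmoid gate, the residual addition, and all function names are assumptions made for illustration, and no learned weights are included.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(scores):
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_aware(features):
    # Assumed relation model: each region attends to every region
    # (itself included) using dot-product scores.
    out = []
    for f in features:
        scores = [sum(a * b for a, b in zip(f, g)) for g in features]
        weights = softmax(scores)
        out.append([
            sum(w * g[d] for w, g in zip(weights, features))
            for d in range(len(f))
        ])
    return out


def gated_relation_aware(features, question):
    # Assumed gate: sigmoid of the element-wise product of the question
    # vector and the relation vector. A gate near 0 suppresses the
    # relation information; a gate near 1 lets it through.
    enhanced = []
    for f, r in zip(features, relation_aware(features)):
        gate = [sigmoid(q * x) for q, x in zip(question, r)]
        enhanced.append([fi + g * ri for fi, g, ri in zip(f, gate, r)])
    return enhanced


# Two hypothetical 2-dimensional region features and a question vector.
regions = [[1.0, 0.0], [0.0, 1.0]]
half_open = gated_relation_aware(regions, [0.0, 0.0])    # every gate is 0.5
closed = gated_relation_aware(regions, [-50.0, -50.0])   # gates close to 0
```

With a zero question vector every gate equals 0.5, so half of the relation information is added to each region. With a strongly negative question vector the gates close and the output stays approximately equal to the input features.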