Improving Visual Reasoning Through Semantic Representation

Authors: Wenfeng Zheng, Xiangjun Liu, Xubin Ni, Lirong Yin, Bo Yang
Language: English
Publication year: 2021
Subject:
Source: IEEE Access, Vol 9, Pp 91476-91486 (2021)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3074937
Description: In visual reasoning, advances in deep learning have significantly improved the accuracy of results. Image features are primarily used as input to obtain answers. However, image features are too redundant for a model to learn accurate characterizations within limited complexity and time. In human reasoning, by contrast, an abstract description of an image is typically used to avoid irrelevant details. Inspired by this, a higher-level representation named semantic representation is introduced. In this paper, a detailed visual reasoning model is proposed. This new model contains an image understanding model based on semantic representation, a feature extraction and processing model refined with the watershed and u-distance methods, a feature vector learning model using pyramidal pooling and a residual network, and a question understanding model that combines a problem embedding coding method with a machine translation decoding method. The resulting feature vector better represents the whole image instead of focusing too heavily on specific characteristics. Using semantic representation as input, the model verifies that introducing a high-level semantic representation yields more accurate results. The results also show that it is feasible and effective to introduce high-level, abstract forms of knowledge representation into deep learning tasks. This study lays a theoretical and experimental foundation for introducing different levels of knowledge representation into deep learning in the future.
Database: Directory of Open Access Journals