AQuA: ASP-Based Visual Question Answering

Authors:	Farhad Shakerin, Gopal Gupta, Kinjal Basu
Year of publication:	2020
Subject:	Artificial neural network; Commonsense knowledge; Computer science; Natural language understanding; Commonsense reasoning; Set (abstract data type); Answer set programming; Question answering; Artificial intelligence; Natural language processing; Natural language
Source:	Practical Aspects of Declarative Languages (PADL), ISBN: 9783030391966
DOI:	10.1007/978-3-030-39197-3_4
Description:	AQuA (ASP-based Question Answering) is an Answer Set Programming (ASP) based visual question answering framework that truly “understands” an input picture and answers natural language questions about that picture. The knowledge contained in the picture is extracted using YOLO, a neural network-based object detection technique, and represented as an answer set program. Natural language processing is performed on the question to transform it into an ASP query. Semantic relations are extracted in the process for deeper understanding and to answer more complex questions. The resulting knowledge base, with additional commonsense knowledge imported, can be used to perform reasoning using an ASP system, allowing it to answer questions about the picture, just like a human. This framework achieves 93.7% accuracy on the CLEVR dataset, which exceeds human baseline performance. What is significant is that AQuA translates a question into an ASP query without requiring any training. Our framework for Visual Question Answering is quite general and closely simulates the way humans operate. In contrast to existing purely machine learning-based methods, our framework provides an explanation for the answer it computes, while maintaining high accuracy.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::020af91f6b2c3e621a018a9de58e6468 https://doi.org/10.1007/978-3-030-39197-3_4
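
The pipeline outlined in the description (detected objects become facts, the question becomes a query, and the answer comes with its supporting evidence) can be illustrated with a minimal Python sketch. This is not AQuA's code and it does not call YOLO or an ASP solver: the `Detection` class, the predicate names `object`, `shape`, `color` and `size`, and the helper `count_matching` are all hypothetical, chosen only to show the general shape of the idea on a hand-written CLEVR-style scene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object; fields are hypothetical stand-ins for detector output."""
    obj_id: int
    shape: str
    color: str
    size: str


def to_asp_facts(detections):
    """Render detections as ASP-style facts (predicate names are illustrative)."""
    facts = []
    for d in detections:
        facts.append(f"object({d.obj_id}).")
        facts.append(f"shape({d.obj_id}, {d.shape}).")
        facts.append(f"color({d.obj_id}, {d.color}).")
        facts.append(f"size({d.obj_id}, {d.size}).")
    return facts


def count_matching(detections, **attrs):
    """Answer a 'how many ...' question by filtering on attribute constraints.

    Returns the count together with the ids of the matching objects,
    which serve as a simple justification for the answer.
    """
    matches = [
        d.obj_id
        for d in detections
        if all(getattr(d, key) == value for key, value in attrs.items())
    ]
    return len(matches), matches


# Hand-written scene standing in for the output of an object detector.
scene = [
    Detection(1, "cube", "red", "large"),
    Detection(2, "sphere", "red", "small"),
    Detection(3, "cube", "blue", "small"),
    Detection(4, "cube", "red", "small"),
]

facts = to_asp_facts(scene)

# "How many red cubes are there?" expressed as attribute constraints.
answer, support = count_matching(scene, shape="cube", color="red")
print(answer, support)  # 2 [1, 4]
```

A real ASP system would derive the answer from the facts and commonsense rules by logical inference; the list filter here only mimics that step so the example stays dependency-free.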