AQuA: ASP-Based Visual Question Answering
Author: | Farhad Shakerin, Gopal Gupta, Kinjal Basu |
---|---|
Publication year: | 2020 |
Subject: | Artificial neural network; Commonsense knowledge; Computer science; Natural language understanding; Commonsense reasoning; Set (abstract data type); Answer set programming; Question answering; Artificial intelligence; Natural language processing; Natural language |
Source: | Practical Aspects of Declarative Languages (PADL), ISBN: 9783030391966 |
DOI: | 10.1007/978-3-030-39197-3_4 |
Description: | AQuA (ASP-based Question Answering) is an Answer Set Programming (ASP) based visual question answering framework that truly “understands” an input picture and answers natural language questions about that picture. The knowledge contained in the picture is extracted using YOLO, a neural network-based object detection technique, and represented as an answer set program. Natural language processing is performed on the question to transform it into an ASP query. Semantic relations are extracted in the process for deeper understanding and to answer more complex questions. The resulting knowledge base, augmented with imported commonsense knowledge, can be used to perform reasoning with an ASP system, allowing it to answer questions about the picture much as a human would. The framework achieves 93.7% accuracy on the CLEVR dataset, which exceeds human baseline performance. Significantly, AQuA translates a question into an ASP query without requiring any training. Our framework for Visual Question Answering is quite general and closely simulates the way humans operate. In contrast to existing purely machine learning-based methods, our framework provides an explanation for the answer it computes, while maintaining high accuracy. |
Database: | OpenAIRE |
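
The pipeline summarized in the description (detected objects turned into facts, then a question answered as a query over those facts) can be illustrated with a minimal Python sketch. This is not AQuA's implementation: the `DetectedObject` class, the predicate names `object/1`, `shape/2`, `color/2`, and `size/2`, and the `count_matching` helper are all hypothetical, the scene is hand-written rather than produced by YOLO, and a plain Python filter stands in for a real ASP solver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedObject:
    """One object as a detector might report it (hypothetical schema)."""
    obj_id: int
    shape: str
    color: str
    size: str


def to_asp_facts(objects):
    """Render detections as ASP-style facts; predicate names are assumptions."""
    lines = []
    for o in objects:
        lines.append(f"object({o.obj_id}).")
        lines.append(f"shape({o.obj_id}, {o.shape}).")
        lines.append(f"color({o.obj_id}, {o.color}).")
        lines.append(f"size({o.obj_id}, {o.size}).")
    return "\n".join(lines)


def count_matching(objects, **constraints):
    """Stand-in for an ASP query: count objects whose attributes all match."""
    return sum(
        all(getattr(o, attr) == value for attr, value in constraints.items())
        for o in objects
    )


# Hand-written scene in the style of CLEVR (not real detector output).
scene = [
    DetectedObject(1, "cube", "red", "large"),
    DetectedObject(2, "sphere", "blue", "small"),
    DetectedObject(3, "cube", "red", "small"),
]

facts = to_asp_facts(scene)

# "How many red cubes are there?" expressed as attribute constraints.
answer = count_matching(scene, shape="cube", color="red")
print(facts)
print("red cubes:", answer)
```

In a real system the generated facts would be passed to an ASP solver together with commonsense rules, and the question would be parsed into a query rather than written by hand as keyword arguments.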