Showing 1 - 10 of 1,398 results for search: '"Visual Question Answering"'
Published in:
Big Data Mining and Analytics, Vol 7, Iss 3, Pp 843-857 (2024)
Previous works employ the Large Language Model (LLM) like GPT-3 for knowledge-based Visual Question Answering (VQA). We argue that the inferential capacity of LLM can be enhanced through knowledge injection. Although methods that utilize knowledge gr
External link:
https://doaj.org/article/4a5c02e765b940a88f6ed65e4d31f4d3
Author:
Heng Zhang, Zhihua Wei, Guanming Liu, Rui Wang, Ruibin Mu, Chuanbao Liu, Aiquan Yuan, Guodong Cao, Ning Hu
Published in:
Virtual Reality & Intelligent Hardware, Vol 6, Iss 4, Pp 280-291 (2024)
Background: External knowledge representations play an essential role in knowledge-based visual question and answering to better understand complex scenarios in the open world. Recent entity-relationship embedding approaches are deficient in represen
External link:
https://doaj.org/article/1c531a501ec4475e8e8cb3b13222674e
Published in:
Alexandria Engineering Journal, Vol 99, Pp 242-256 (2024)
The widespread implementation of surveillance systems on construction sites has led to the accumulation of vast amounts of visual data, highlighting the need for an effective semantic analysis methodology. Natural language, as the most intuitive mode
External link:
https://doaj.org/article/220ef99f59d34df7a9a00fd63a1e841a
Published in:
Visual Computing for Industry, Biomedicine, and Art, Vol 7, Iss 1, Pp 1-13 (2024)
With recent advancements in robotic surgery, notable strides have been made in visual question answering (VQA). Existing VQA systems typically generate textual answers to questions but fail to indicate the location of the relevant content wi
External link:
https://doaj.org/article/aacb44afa9224e1da20c704418d07e75
Published in:
Techno.Com, Vol 23, Iss 1, Pp 136-148 (2024)
Indonesia is stepping up its preparations for digital transformation in various sectors, including education. One of the government's efforts is to implement e-learning platforms in teaching and learning activities
External link:
https://doaj.org/article/cfcdecc4c35b4d8789989c38408784cb
Published in:
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 17, Pp 14823-14835 (2024)
Remote sensing (RS) visual question answering (VQA) provides accurate answers through the analysis of RS images (RSIs) and associated questions. Recent research has increasingly adopted transformers for feature extraction. However, this trend leads t
External link:
https://doaj.org/article/9b2880f8a088446b81905f73107a8f63
Published in:
IEEE Access, Vol 12, Pp 76367-76378 (2024)
Recent advances in visual representation learning allowed for the construction of a plethora of powerful features that are ready to use for numerous downstream tasks. Contrary to existing representation evaluations typically based on image or pixel-w
External link:
https://doaj.org/article/297e4dc099694f6285cd3ebc5051520e
Published in:
SoftwareX, Vol 26, Pp 101731 (2024)
ConfigILM is an open-source Python library for rapid iterative development of image-language models for visual question answering in PyTorch. It provides a convenient implementation for seamlessly combining image and language models from two popular
External link:
https://doaj.org/article/b331d46c8b80461e845b76125c5673f8
Author:
Hilmi Demirhan, Wlodek Zadrozny
Published in:
BioMedInformatics, Vol 4, Iss 1, Pp 50-74 (2023)
Multimodal medical question answering (MMQA) is a vital area bridging healthcare and Artificial Intelligence (AI). This survey methodically examines the MMQA research published in recent years. We collect academic literature through Google Scholar, a
External link:
https://doaj.org/article/c4308bd755224cb8889226ff316dee55
Author:
Jiangfan Feng, Hui Wang
Published in:
International Journal of Applied Earth Observations and Geoinformation, Vol 126, Pp 103641 (2024)
Remote sensing visual question answering (RSVQA) is a user-friendly method used for analyzing remote sensing images (RSIs) in various tasks. However, current methods often overlook geospatial objects, which possess a multi-scale representation and re
External link:
https://doaj.org/article/ce6b3aef0b2941349cc55c9abaf68a3d