Showing 1 - 10 of 15 for search: '"Thao Minh Le"'
Published in:
VNU Journal of Science: Computer Science and Communication Engineering. 38
This paper presents VieCap4H, a grand data challenge on automatic image caption generation for the healthcare domain in Vietnamese. VieCap4H is held as part of the eighth annual workshop on Vietnamese Language and Speech Processing (VLSP 2021). The ta
The current success of modern visual reasoning systems is arguably attributed to cross-modality attention mechanisms. However, in deliberative reasoning such as in VQA, attention is unconstrained at each step, and thus may serve as a statistical pool
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::78b9745b00d24ca314152c95d6c67f5d
http://arxiv.org/abs/2205.12616
Published in:
Lecture Notes in Computer Science ISBN: 9783031198410
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::39c7011c1c589e4197d063d4da29d0fb
https://doi.org/10.1007/978-3-031-19842-7_41
Published in:
KDD
The rise of big data and big compute has brought modern neural networks to many walks of digital life, thanks to the relative ease of constructing large models that scale to the real world. Current successes of Transformers and self-supervised pretra
Published in:
IEEE/ACM transactions on computational biology and bioinformatics. 19(2)
Predicting the interaction between a compound and a target is crucial for rapid drug repurposing. Deep learning has been successfully applied to the drug-target affinity (DTA) problem. However, previous deep learning-based methods ignore modeling the dir
Published in:
IJCNN
Video question answering (Video QA) presents a powerful testbed for human-like intelligent behaviors. The task demands new capabilities to integrate video processing, language understanding, binding abstract linguistic concepts to concrete visual art
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8bb7ad4f2d481aeb52224caae45e9503
Published in:
IJCAI
Video Question Answering (Video QA) is a powerful testbed to develop new AI capabilities. This task necessitates learning to reason about objects, relations, and events across visual and linguistic domains in space-time. High-level reasoning demands
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3a2e1fa420a7213ebaba196a9c002892
Video QA challenges modelers on multiple fronts. Modeling video necessitates building not only spatio-temporal models for the dynamic visual channel but also multimodal structures for associated information channels such as subtitles or audio. Video
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b4db2ced1c809cb99d35ea16b69151d
http://arxiv.org/abs/2010.10019
Published in:
IJCAI
We present Language-binding Object Graph Network, the first neural reasoning method with dynamic relational structures across both visual and textual domains with applications in visual question answering. Relaxing the common assumption made by curre