Showing 1 - 10 of 13 for search: '"Andres Mafla"'
Academic article
Published in:
Lecture Notes in Computer Science ISBN: 9783031250682
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::4e61563427faf40f048bb27ad2dcaf01
https://doi.org/10.1007/978-3-031-25069-9_23
Authors:
Dimosthenis Karatzas, Ernest Valveny, Marçal Rusiñol, Ali Furkan Biten, Lluis Gomez, Andres Mafla, Rubèn Tito
Published in:
Pattern Recognition Letters. 150:242-249
This paper presents a new model for the task of scene text visual question answering, in which questions about a given image can only be answered by reading and understanding scene text that is present in it. The proposed model is based on an attenti
Published in:
WACV
Scene text instances found in natural images carry explicit semantic information that can provide important cues to solve a wide array of computer vision problems. In this paper, we focus on leveraging multi-modal content in the form of visual and te
Published in:
WACV
Recent models for cross-modal retrieval have benefited from an increasingly rich understanding of visual scenes, afforded by scene graphs and object interactions to mention a few. This has resulted in an improved matching between the visual represent
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fe953203bf2006752f91a6bbaecce237
http://arxiv.org/abs/2012.04329
Published in:
WACV
Text contained in an image carries high-level semantics that can be exploited to achieve richer image understanding. In particular, the mere presence of text provides strong guiding content that should be employed to tackle a diversity of computer vi
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fb53162ab8f4715fdb97d2263a324f5a
Authors:
Ali Furkan Biten, Dimosthenis Karatzas, C. V. Jawahar, Marçal Rusiñol, Rubèn Tito, Lluis Gomez, Andres Mafla, Ernest Valveny
Published in:
ICCV
Current visual question answering datasets do not consider the rich semantic information conveyed by text within an image. In this work, we present a new dataset, ST-VQA, that aims to highlight the importance of exploiting high-level semantic informa
Authors:
C. V. Jawahar, Lluis Gomez, Dimosthenis Karatzas, Marçal Rusiñol, Ernest Valveny, Ali Furkan Biten, Andres Mafla, Minesh Mathew, Rubèn Tito
Published in:
ICDAR
2019 International Conference on Document Analysis and Recognition (ICDAR)
This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system up to date, namely the incorporation of sce
Authors:
Marçal Rusiñol, Andres Mafla, Lluis Gomez, Ernest Valveny, Sounak Dey, Rubèn Tito, Dimosthenis Karatzas
Published in:
Pattern Recognition. 110:107656
In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact represe
Published in:
Computer Vision – ECCV 2018 ISBN: 9783030012632
ECCV (14)
Lecture Notes in Computer Science
Lecture Notes in Computer Science-Computer Vision – ECCV 2018
Textual information found in scene images provides high level semantic information about the image and its context and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text quer
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dd1ec81189f8faca0144788fcea9409e
https://doi.org/10.1007/978-3-030-01264-9_43