Search results - "Ernest Valveny"

Academic article

EKTVQA: Generalized Use of External Knowledge to Empower Scene Text in Text-VQA

Authors: Arka Ujjal Dey, Ernest Valveny, Gaurav Harit

Published in: IEEE Access, Vol. 10, pp. 72092-72106 (2022)

The open-ended question answering task of Text-VQA often requires reading and reasoning about rarely seen or completely unseen scene text content of an image. We address this zero-shot nature of the task by proposing the generalized use of external k

External link: https://doaj.org/article/0aa82099de6941e4ad0f990d0ff5084f

Report

Multimodal Transformer for Comics Text-Cloze

Authors: Emanuele Vivoli, Joan Lafuente Baeza, Ernest Valveny Llobet, Dimosthenis Karatzas

This work explores a closure task in comics, a medium where visual and textual elements are intricately intertwined. Specifically, Text-cloze refers to the task of selecting the correct text to use in a comic panel, given its neighboring panels. Trad

External link: http://arxiv.org/abs/2403.03719

OCR-IDL: OCR Annotations for Industry Document Library Dataset

Authors: Ali Furkan Biten, Rubèn Tito, Lluis Gomez, Ernest Valveny, Dimosthenis Karatzas

Published in: Lecture Notes in Computer Science ISBN: 9783031250682

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::4cba3c96ecf2fee792602e087c866df7
https://doi.org/10.1007/978-3-031-25069-9_16

Multimodal grid features and cell pointers for scene text visual question answering

Authors: Dimosthenis Karatzas, Ernest Valveny, Marçal Rusiñol, Ali Furkan Biten, Lluis Gomez, Andres Mafla, Rubèn Tito

Published in: Pattern Recognition Letters. 150:242-249

This paper presents a new model for the task of scene text visual question answering, in which questions about a given image can only be answered by reading and understanding scene text that is present in it. The proposed model is based on an attenti

External links: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d8ed10e7bce429acee75bf6dabb64e94
https://doi.org/10.1016/j.patrec.2021.06.026

Beyond visual semantics: Exploring the role of scene text in image understanding

Authors: Ernest Valveny, Gaurav Harit, Arka Ujjal Dey, Suman K. Ghosh

Published in: Pattern Recognition Letters. 149:164-171

Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to

External links: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::711fd087b5e60f4482386390b703fd06
https://doi.org/10.1016/j.patrec.2021.06.011

TRANSVERSAL APPROACH TO CURRICULUM DESIGN - HUMAN-CENTERED ARTIFICIAL INTELLIGENCE

Authors: Ernest Valveny, Ramon Vilanova

Published in: EDULEARN Proceedings.

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::0931de49a71a4f35a66d27be6d2baf7b
https://doi.org/10.21125/edulearn.2022.2469

InfographicVQA

Authors: Minesh Mathew, Viraj Bagal, Ruben Tito, Dimosthenis Karatzas, Ernest Valveny, C. V. Jawahar

Published in: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::0c940b52571b7d1cab0a7e8ae6dd88e8
https://doi.org/10.1109/wacv51458.2022.00264

EKTVQA: Generalized use of External Knowledge to empower Scene Text in Text-VQA

Authors: Arka Ujjal Dey, Ernest Valveny, Gaurav Harit

The open-ended question answering task of Text-VQA often requires reading and reasoning about rarely seen or completely unseen scene-text content of an image. We address this zero-shot nature of the problem by proposing the generalized use of externa

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::96ee2e538bbb532e8813296779f543c2

ICDAR 2021 Competition on Document Visual Question Answering

Authors: Dimosthenis Karatzas, Minesh Mathew, C. V. Jawahar, Ernest Valveny, Rubèn Tito

Published in: Document Analysis and Recognition – ICDAR 2021 ISBN: 9783030863364
ICDAR (4)

In this report we present results of the ICDAR 2021 edition of the Document Visual Question Challenges. This edition complements the previous tasks on Single Document VQA and Document Collection VQA with a newly introduced task on Infographics VQA. Infogr

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::9aba9eef8f89b34e7e0d75b0014c8921
https://doi.org/10.1007/978-3-030-86337-1_42

Document Collection Visual Question Answering

Authors: Dimosthenis Karatzas, Rubèn Tito, Ernest Valveny

Published in: Document Analysis and Recognition – ICDAR 2021 ISBN: 9783030863302
ICDAR (2)

Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. T

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::03d5eae05b8e0b683f84b095c180b546
https://doi.org/10.1007/978-3-030-86331-9_50
