Showing 1 - 10 of 1,578 results for search: '"Lladós"'
Authors:
Tobaben, Marlon, Souibgui, Mohamed Ali, Tito, Rubèn, Nguyen, Khanh, Kerkouche, Raouf, Jung, Kangsoo, Jälkö, Joonas, Kang, Lei, Barsky, Andrey, d'Andecy, Vincent Poulain, Joseph, Aurélie, Muhamed, Aashiq, Kuo, Kevin, Smith, Virginia, Yamasaki, Yusuke, Fukami, Takumi, Niwa, Kenta, Tyou, Iifan, Ishii, Hiro, Yokota, Rio, N, Ragul, Kutum, Rintu, Llados, Josep, Valveny, Ernest, Honkela, Antti, Fritz, Mario, Karatzas, Dimosthenis
The Privacy Preserving Federated Learning Document VQA (PFL-DocVQA) competition challenged the community to develop provably private and communication-efficient solutions in a federated setting for a real-life use case: invoice processing. The compet
External link:
http://arxiv.org/abs/2411.03730
Although foundational vision-language models (VLMs) have proven to be very successful for various semantic discrimination tasks, they still struggle to perform faithfully for fine-grained categorization. Moreover, foundational models trained on one d
External link:
http://arxiv.org/abs/2409.01835
The proliferation of scene text in both structured and unstructured environments presents significant challenges in optical character recognition (OCR), necessitating more efficient and robust text spotting solutions. This paper presents FastTextSpot
External link:
http://arxiv.org/abs/2408.14998
Authors:
Pilligua, Maria, Biescas, Nil, Vazquez-Corral, Javier, Lladós, Josep, Valveny, Ernest, Biswas, Sanket
The rapid evolution of intelligent document processing systems demands robust solutions that adapt to diverse domains without extensive retraining. Traditional methods often falter with variable document types, leading to poor performance. To overcom
External link:
http://arxiv.org/abs/2406.08610
Authors:
Biswas, Sanket, Jain, Rajiv, Morariu, Vlad I., Gu, Jiuxiang, Mathur, Puneet, Wigington, Curtis, Sun, Tong, Lladós, Josep
While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge. This paper delves into this advanced domain, proposing a novel approach
External link:
http://arxiv.org/abs/2406.08354
Authors:
Van Landeghem, Jordy, Maity, Subhajit, Banerjee, Ayan, Blaschko, Matthew, Moens, Marie-Francine, Lladós, Josep, Biswas, Sanket
This work explores knowledge distillation (KD) for visually-rich document (VRD) applications such as document layout analysis (DLA) and document image classification (DIC). While VRD research is dependent on increasingly sophisticated and cumbersome
External link:
http://arxiv.org/abs/2406.08226
This paper introduces Fetch-A-Set (FAS), a comprehensive benchmark tailored for legislative historical document analysis systems, addressing the challenges of large-scale document retrieval in historical contexts. The benchmark comprises a vast repos
External link:
http://arxiv.org/abs/2406.07315
This paper presents GeoContrastNet, a language-agnostic framework for structured document understanding (DU) that integrates a contrastive learning objective with graph attention networks (GATs), emphasizing the significant role of geometric features.
External link:
http://arxiv.org/abs/2405.03104
We present SketchGPT, a flexible framework that employs a sequence-to-sequence autoregressive model for sketch generation and completion, and an interpretation case study for sketch recognition. By mapping complex sketches into simplified sequences
External link:
http://arxiv.org/abs/2405.03099
Generating VectorArt from text prompts is a challenging vision task, requiring diverse yet realistic depictions of the seen as well as unseen entities. However, existing research has been mostly limited to the generation of single objects, rather tha
External link:
http://arxiv.org/abs/2404.00412