Showing 1 - 10 of 1,611 results for search: '"Karatzas P"'
Modern edge data centers simultaneously handle multiple Deep Neural Networks (DNNs), leading to significant challenges in workload management. Thus, current management systems must leverage the architectural heterogeneity of new embedded systems to e
External link:
http://arxiv.org/abs/2411.17867
Author:
Paramanayakam, Varatheepan, Karatzas, Andreas, Anagnostopoulos, Iraklis, Stamoulis, Dimitrios
The advanced function-calling capabilities of foundation models open up new possibilities for deploying agents to perform complex API tasks. However, managing large amounts of data and interacting with numerous APIs makes function calling hardware-in
External link:
http://arxiv.org/abs/2411.15399
Author:
Tobaben, Marlon, Souibgui, Mohamed Ali, Tito, Rubèn, Nguyen, Khanh, Kerkouche, Raouf, Jung, Kangsoo, Jälkö, Joonas, Kang, Lei, Barsky, Andrey, d'Andecy, Vincent Poulain, Joseph, Aurélie, Muhamed, Aashiq, Kuo, Kevin, Smith, Virginia, Yamasaki, Yusuke, Fukami, Takumi, Niwa, Kenta, Tyou, Iifan, Ishii, Hiro, Yokota, Rio, N, Ragul, Kutum, Rintu, Llados, Josep, Valveny, Ernest, Honkela, Antti, Fritz, Mario, Karatzas, Dimosthenis
The Privacy Preserving Federated Learning Document VQA (PFL-DocVQA) competition challenged the community to develop provably private and communication-efficient solutions in a federated setting for a real-life use case: invoice processing. The compet
External link:
http://arxiv.org/abs/2411.03730
The comic domain is rapidly advancing with the development of single- and multi-page analysis and synthesis models. Recent benchmarks and datasets have been introduced to support and assess models' capabilities in tasks such as detection (panels, cha
External link:
http://arxiv.org/abs/2409.16159
Author:
Vivoli, Emanuele, Barsky, Andrey, Souibgui, Mohamed Ali, LLabres, Artemis, Bertini, Marco, Karatzas, Dimosthenis
Vision-language models have recently evolved into versatile systems capable of high performance across a range of tasks, such as document understanding, visual question answering, and grounding, often in zero-shot settings. Comics Understanding, a co
External link:
http://arxiv.org/abs/2409.09502
Author:
Kang, Lei, Yang, Fei, Wang, Kai, Souibgui, Mohamed Ali, Gomez, Lluis, Fornés, Alicia, Valveny, Ernest, Karatzas, Dimosthenis
Fonts are integral to creative endeavors, design processes, and artistic productions. The appropriate selection of a font can significantly enhance artwork and endow advertisements with a higher level of expressivity. Despite the availability of nume
External link:
http://arxiv.org/abs/2408.07259
We address the problem of detecting and mapping all books in a collection of images to entries in a given book catalogue. Instead of performing independent retrieval for each book detected, we treat the image-text mapping problem as a many-to-many ma
External link:
http://arxiv.org/abs/2407.19812
The comic domain is rapidly advancing with the development of single-page analysis and synthesis models. However, evaluation metrics and datasets lag behind, often limited to small-scale or single-style test sets. We introduce a novel benchmark, CoMi
External link:
http://arxiv.org/abs/2407.03550
Author:
Vivoli, Emanuele, Campaioli, Irene, Nardoni, Mariateresa, Biondi, Niccolò, Bertini, Marco, Karatzas, Dimosthenis
Comics, as a medium, uniquely combine text and images in styles often distinct from real-world visuals. For the past three decades, computational research on comics has evolved from basic object detection to more sophisticated tasks. However, the fie
External link:
http://arxiv.org/abs/2407.03540
Author:
Singh, Simranjit, Fore, Michael, Karatzas, Andreas, Lee, Chaehong, Jian, Yanan, Shangguan, Longfei, Yu, Fuxun, Anagnostopoulos, Iraklis, Stamoulis, Dimitrios
As Large Language Models (LLMs) broaden their capabilities to manage thousands of API calls, they are confronted with complex data operations across vast datasets with significant overhead to the underlying system. In this work, we introduce LLM-dCac
External link:
http://arxiv.org/abs/2406.06799