Showing 1 - 10 of 421 for search: '"PARDO, Alejandro"'
Author:
Pardo, Alejandro, Pizzati, Fabio, Zhang, Tong, Pondaven, Alexander, Torr, Philip, Perez, Juan Camilo, Ghanem, Bernard
Match-cuts are powerful cinematic tools that create seamless transitions between scenes, delivering strong visual and metaphorical connections. However, crafting match-cuts is a challenging, resource-intensive process requiring deliberate artistic pl
External link:
http://arxiv.org/abs/2411.18677
Author:
Pardo, Alejandro, Wang, Jui-Hsien, Ghanem, Bernard, Sivic, Josef, Russell, Bryan, Heilbron, Fabian Caba
The objective of this work is to manipulate visual timelines (e.g. a video) through natural language instructions, making complex timeline editing tasks accessible to non-expert or potentially even disabled users. We call this task Instructed visual
External link:
http://arxiv.org/abs/2411.12293
Author:
Pérez, Juan C., Pardo, Alejandro, Soldan, Mattia, Itani, Hani, Leon-Alcazar, Juan, Ghanem, Bernard
This study investigates whether Compressed-Language Models (CLMs), i.e. language models operating on raw byte streams from Compressed File Formats (CFFs), can understand files compressed by CFFs. We focus on the JPEG format as a representative CFF, g
External link:
http://arxiv.org/abs/2405.17146
Understanding videos that contain multiple modalities is crucial, especially in egocentric videos, where combining various sensory inputs significantly improves tasks like action recognition and moment localization. However, real-world applications o
External link:
http://arxiv.org/abs/2404.15161
Author:
Argaw, Dawit Mureja, Soldan, Mattia, Pardo, Alejandro, Zhao, Chen, Heilbron, Fabian Caba, Chung, Joon Son, Ghanem, Bernard
Movie trailers are an essential tool for promoting films and attracting audiences. However, the process of creating trailers can be time-consuming and expensive. To streamline this process, we propose an automatic trailer generation framework that ge
External link:
http://arxiv.org/abs/2404.03477
Multimodal video understanding is crucial for analyzing egocentric videos, where integrating multiple sensory signals significantly enhances action recognition and moment localization. However, practical applications often grapple with incomplete mod
External link:
http://arxiv.org/abs/2401.11470
Author:
Alfarra, Motasem, Itani, Hani, Pardo, Alejandro, Alhuwaider, Shyma, Ramazanova, Merey, Pérez, Juan C., Cai, Zhipeng, Müller, Matthias, Ghanem, Bernard
This paper proposes a novel online evaluation protocol for Test Time Adaptation (TTA) methods, which penalizes slower methods by providing them with fewer samples for adaptation. TTA methods leverage unlabeled data at test time to adapt to distributi
External link:
http://arxiv.org/abs/2304.04795
Author:
Soldan, Mattia, Pardo, Alejandro, Alcázar, Juan León, Heilbron, Fabian Caba, Zhao, Chen, Giancola, Silvio, Ghanem, Bernard
Published in:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
The recent and increasing interest in video-language research has driven the development of large-scale datasets that enable data-intensive machine learning techniques. In comparison, limited effort has been made at assessing the fitness of these dat
External link:
http://arxiv.org/abs/2112.00431
Author:
González Arbeláez, Luisa Fernanda, Ciocci Pardo, Alejandro, Burgos, Juan Ignacio, Vila Petroff, Martín Gerardo, Godoy Coto, Joshua, Ennis, Irene Lucía, Mosca, Susana María, Fantinelli, Juliana Catalina
Published in:
Archives of Biochemistry and Biophysics, August 2024, 758