Search results

Report

Text-Enhanced Zero-Shot Action Recognition: A training-free approach

Authors: Bosetti, Massimo, Zhang, Shibingfeng, Liberatori, Benedetta, Zara, Giacomo, Ricci, Elisa, Rota, Paolo

Vision-language models (VLMs) have demonstrated remarkable performance across various visual tasks, leveraging joint learning of visual and textual representations. While these models excel in zero-shot image tasks, their application to zero-shot vid

External link: http://arxiv.org/abs/2408.16412

Report

Automatic benchmarking of large multimodal models via iterative experiment programming

Authors: Conti, Alessandro, Fini, Enrico, Rota, Paolo, Wang, Yiming, Mancini, Massimiliano, Ricci, Elisa

Assessing the capabilities of large multimodal models (LMMs) often requires the creation of ad-hoc evaluations. Currently, building new benchmarks requires tremendous amounts of manual work for each specific analysis. This makes the evaluation proces

External link: http://arxiv.org/abs/2406.12321

Report

Four microlensing giant planets detected through signals produced by minor-image perturbations

Authors: Han, Cheongho, Bond, Ian A., Lee, Chung-Uk, Gould, Andrew, Albrow, Michael D., Chung, Sun-Ju, Hwang, Kyu-Ha, Jung, Youn Kil, Ryu, Yoon-Hyun, Shvartzvald, Yossi, Shin, In-Gu, Yee, Jennifer C., Yang, Hongjing, Zang, Weicheng, Cha, Sang-Mok, Kim, Doeon, Kim, Dong-Jin, Kim, Seung-Lee, Lee, Dong-Joo, Lee, Yongseok, Park, Byeong-Gon, Pogge, Richard W., Abe, Fumio, Bando, Ken, Barry, Richard, Bennett, David P., Bhattacharya, Aparna, Fujii, Hirosame, Fukui, Akihiko, Hamada, Ryusei, Hamasaki, Shunya Hamada Naoto, Hirao, Yuki, Silva, Stela Ishitani, Itow, Yoshitaka, Kirikawa, Rintaro, Koshimoto, Naoki, Matsubara, Yutaka, Miyazaki, Shota, Muraki, Yasushi, Nagai, Tutumi, Nunota, Kansuke, Olmschenk, Greg, Ranc, Clément, Rattenbury, Nicholas J., Satoh, Yuki, Sumi, Takahiro, Suzuki, Daisuke, Tomoyoshi, Mio, Tristram, Paul J., Vandorou, Aikaterini, Yama, Hibiki, Yamashita, Kansuke, Bachelet, Etienne, Rota, Paolo, Bozza, Valerio, Zielinski, Paweł, Street, Rachel A., Tsapras, Yiannis, Hundertmark, Markus, Wambsganss, Joachim, Wyrzykowski, Łukasz, Jaimes, Roberto Figuera, Cassan, Arnaud, Dominik, Martin, Rybicki, Krzysztof A., Rabus, Markus

We investigated the nature of the anomalies appearing in four microlensing events KMT-2020-BLG-0757, KMT-2022-BLG-0732, KMT-2022-BLG-1787, and KMT-2022-BLG-1852. The light curves of these events commonly exhibit initial bumps followed by subsequent t

External link: http://arxiv.org/abs/2406.10547

Report

Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling

Authors: Mekonnen, Kidist Amde, Dall'Asen, Nicola, Rota, Paolo

Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models, achieving remarkable performance in image synthesis tasks. However, these models face challenges in terms of widespread adoption due to their reliance o

External link: http://arxiv.org/abs/2405.20675

Report

Vocabulary-free Image Classification and Semantic Segmentation

Authors: Conti, Alessandro, Fini, Enrico, Mancini, Massimiliano, Rota, Paolo, Wang, Yiming, Ricci, Elisa

Large vision-language models revolutionized image classification and semantic segmentation paradigms. However, they typically assume a pre-defined set of categories, or vocabulary, at test time for composing textual prompts. This assumption is imprac

External link: http://arxiv.org/abs/2404.10864

Report

Socially Pertinent Robots in Gerontological Healthcare

Despite the many recent achievements in developing and deploying social robotics, there are still many underexplored environments and applications for which systematic evaluation of such systems by end-users is necessary. While several robotic platfo

External link: http://arxiv.org/abs/2404.07560

Report

Test-Time Zero-Shot Temporal Action Localization

Authors: Liberatori, Benedetta, Conti, Alessandro, Rota, Paolo, Wang, Yiming, Ricci, Elisa

Zero-Shot Temporal Action Localization (ZS-TAL) seeks to identify and locate actions in untrimmed videos unseen during training. Existing ZS-TAL methods involve fine-tuning a model on a large amount of annotated training data. While effective, traini

External link: http://arxiv.org/abs/2404.05426

Report

Optical monitoring of the Didymos-Dimorphos asteroid system with the Danish telescope around the DART mission impact

NASA's Double Asteroid Redirection Test (DART) was a unique planetary defence and technology test mission, the first of its kind. The main spacecraft of the DART mission impacted the target asteroid Dimorphos, a small moon orbiting asteroid (6580

External link: http://arxiv.org/abs/2311.01982

Report

The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation

Authors: Zara, Giacomo, Conti, Alessandro, Roy, Subhankar, Lathuilière, Stéphane, Rota, Paolo, Ricci, Elisa

The Source-Free Video Unsupervised Domain Adaptation (SFVUDA) task consists of adapting an action recognition model, trained on a labelled source dataset, to an unlabelled target dataset, without accessing the actual source data. The previous approaches

External link: http://arxiv.org/abs/2308.09139

Report

Vocabulary-free Image Classification

Authors: Conti, Alessandro, Fini, Enrico, Mancini, Massimiliano, Rota, Paolo, Wang, Yiming, Ricci, Elisa

Recent advances in large vision-language models have revolutionized the image classification paradigm. Despite showing impressive zero-shot capabilities, a pre-defined set of categories, a.k.a. the vocabulary, is assumed at test time for composing th

External link: http://arxiv.org/abs/2306.00917
