Search results

Report

VidCtx: Context-aware Video Question Answering with Image Models

Authors: Goulas, Andreas; Mezaris, Vasileios; Patras, Ioannis

To address computational and memory limitations of Large Multimodal Models in the Video Question-Answering task, several recent methods extract textual representations per frame (e.g., by captioning) and feed them to a Large Language Model (LLM) that

External link: http://arxiv.org/abs/2412.17415

Report

Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology

Authors: Alwazzan, Omnia; Gallagher-Syed, Amaya; Millner, Thomas O.; Brandner, Sebastian; Patras, Ioannis; Marino, Silvia; Slabaugh, Gregory

The integration of DNA methylation data with a Whole Slide Image (WSI) offers significant potential for enhancing the diagnostic precision of central nervous system (CNS) tumor classification in neuropathology. While existing approaches typically int

External link: http://arxiv.org/abs/2411.17418

Report

ReWind: Understanding Long Videos with Instructed Learnable Memory

Authors: Diko, Anxhelo; Wang, Tinghuai; Swaileh, Wassim; Sun, Shiyan; Patras, Ioannis

Vision-Language Models (VLMs) are crucial for applications requiring integrated understanding of textual and visual information. However, existing VLMs struggle with long videos due to computational inefficiency, memory limitations, and difficulties in

External link: http://arxiv.org/abs/2411.15556

Report

The Exponential Lie Series and a Chen-Strichartz Formula for Lévy Processes

Authors: Ebrahimi-Fard, Kurusch; Patras, Frederic; Wiese, Anke

In this paper, we derive a Chen-Strichartz formula for stochastic differential equations driven by Lévy processes, that is, we derive a series expansion of the logarithm of the flowmap of the stochastic differential equation in terms of commutators o

External link: http://arxiv.org/abs/2411.06827

Report

Neural Networks Decoded: Targeted and Robust Analysis of Neural Network Decisions via Causal Explanations and Reasoning

Authors: Diallo, Alec F.; Belle, Vaishak; Patras, Paul

Despite their success and widespread adoption, the opaque nature of deep neural networks (DNNs) continues to hinder trust, especially in critical applications. Current interpretability solutions often yield inconsistent or oversimplified explanations

External link: http://arxiv.org/abs/2410.05484

Report

CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition

Authors: Sun, Zhonglin; Song, Siyang; Patras, Ioannis; Tzimiropoulos, Georgios

Privacy is a main concern in developing face recognition techniques. Although synthetic face images can partially mitigate potential legal risks while maintaining effective face recognition (FR) performance, FR models trained by face images syn

External link: http://arxiv.org/abs/2409.18876

Report

Behaviour4All: in-the-wild Facial Behaviour Analysis Toolkit

Authors: Kollias, Dimitrios; Shao, Chunchang; Kaloidas, Odysseus; Patras, Ioannis

In this paper, we introduce Behavior4All, a comprehensive, open-source toolkit for in-the-wild facial behavior analysis, integrating Face Localization, Valence-Arousal Estimation, Basic Expression Recognition and Action Unit Detection, all within a s

External link: http://arxiv.org/abs/2409.17717

Report

MM2Latent: Text-to-facial image generation and editing in GANs with multimodal assistance

Authors: Meng, Debin; Tzelepis, Christos; Patras, Ioannis; Tzimiropoulos, Georgios

Generating human portraits is a hot topic in the image generation area, e.g. mask-to-face generation and text-to-face generation. However, these unimodal generation methods lack controllability in image generation. Controllability can be enhanced by

External link: http://arxiv.org/abs/2409.11010

Report

CLIPCleaner: Cleaning Noisy Labels with CLIP

Authors: Feng, Chen; Tzimiropoulos, Georgios; Patras, Ioannis

Learning with Noisy labels (LNL) poses a significant challenge for the Machine Learning community. Some of the most widely used approaches that select as clean samples for which the model itself (the in-training model) has high confidence, e.g., `sma

External link: http://arxiv.org/abs/2408.10012

Report

Are CLIP features all you need for Universal Synthetic Image Origin Attribution?

Authors: Cioni, Dario; Tzelepis, Christos; Seidenari, Lorenzo; Patras, Ioannis

The steady improvement of Diffusion Models for visual synthesis has given rise to many new and interesting use cases of synthetic images but also has raised concerns about their potential abuse, which poses significant societal threats. To address th

External link: http://arxiv.org/abs/2408.09153
