Search results

Report

Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels

Authors: Tu, Weijie; Deng, Weijian; Campbell, Dylan; Yao, Yu; Zheng, Jiyang; Gedeon, Tom; Liu, Tongliang

As large multimodal models (LMMs) are increasingly deployed across diverse applications, the need for adaptable, real-world model ranking has become paramount. Traditional evaluation methods are largely dataset-centric, relying on fixed, labeled data

External link: http://arxiv.org/abs/2412.06461

Report

When Spatial meets Temporal in Action Recognition

Authors: Chen, Huilin; Wang, Lei; Chen, Yifan; Gedeon, Tom; Koniusz, Piotr

Video action recognition has made significant strides, but challenges remain in effectively using both spatial and temporal information. While existing methods often focus on either spatial features (e.g., object appearance) or temporal dynamics (e.g

External link: http://arxiv.org/abs/2411.15284

Report

Dynamics of a state-dependent delay-differential equation

Authors: Gedeon, Tomas; Humphries, Antony R.; Mackey, Michael C.; Walther, Hans-Otto; Zhao, Wang

We present a detailed study of a scalar differential equation with threshold state-dependent delayed feedback. This equation arises as a simplification of a gene regulatory model. There are two monotone nonlinearities in the model: one describes the

External link: http://arxiv.org/abs/2410.13092

Report

Visual Prompting in LLMs for Enhancing Emotion Recognition

Authors: Zhang, Qixuan; Wang, Zhifeng; Zhang, Dylan; Niu, Wenjia; Caldwell, Sabrina; Gedeon, Tom; Liu, Yang; Qin, Zhenyue

Vision Large Language Models (VLLMs) are transforming the intersection of computer vision and natural language processing. Nonetheless, the potential of using visual prompts for emotion recognition in these models remains largely unexplored and untap

External link: http://arxiv.org/abs/2410.02244

Report

Toward a Holistic Evaluation of Robustness in CLIP Models

Authors: Tu, Weijie; Deng, Weijian; Gedeon, Tom

Contrastive Language-Image Pre-training (CLIP) models have shown significant potential, particularly in zero-shot classification across diverse distribution shifts. Building on existing evaluations of overall classification robustness, this work aims

External link: http://arxiv.org/abs/2410.01534

Report

LEGO: Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

Authors: Ding, Dexuan; Wang, Lei; Zhu, Liyun; Gedeon, Tom; Koniusz, Piotr

In computer vision tasks, features often come from diverse representations, domains, and modalities, such as text, images, and videos. Effectively fusing these features is essential for robust performance, especially with the availability of powerful

External link: http://arxiv.org/abs/2410.01506

Report

TrackNetV4: Enhancing Fast Sports Object Tracking with Motion Attention Maps

Authors: Raj, Arjun; Wang, Lei; Gedeon, Tom

Accurately detecting and tracking high-speed, small objects, such as balls in sports videos, is challenging due to factors like motion blur and occlusion. Although recent deep learning frameworks like TrackNetV1, V2, and V3 have advanced tennis ball

External link: http://arxiv.org/abs/2409.14543

Report

Machine Learning to Detect Anxiety Disorders from Error-Related Negativity and EEG Signals

Authors: Chandrasekar, Ramya; Hasan, Md Rakibul; Ghosh, Shreya; Gedeon, Tom; Hossain, Md Zakir

Anxiety is a common mental health condition characterised by excessive worry, fear and apprehension about everyday situations. Even with significant progress over the past few years, predicting anxiety from electroencephalographic (EEG) signals, spec

External link: http://arxiv.org/abs/2410.00028

Report

MRAC Track 1: 2nd Workshop on Multimodal, Generative and Responsible Affective Computing

Authors: Ghosh, Shreya; Cai, Zhixi; Dhall, Abhinav; Kollias, Dimitrios; Goecke, Roland; Gedeon, Tom

With the rapid advancements in multimodal generative technology, Affective Computing research has provoked discussion about the potential consequences of AI systems equipped with emotional intelligence. Affective Computing involves the design, evalua

External link: http://arxiv.org/abs/2409.07256

Report

MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding

Authors: Madan, Surbhi; Ghosh, Shreya; Sookha, Lownish Rai; Ganaie, M. A.; Subramanian, Ramanathan; Dhall, Abhinav; Gedeon, Tom

Estimating the Most Important Person (MIP) in any social event setup is a challenging problem mainly due to contextual complexity and scarcity of labeled data. Moreover, the causality aspects of MIP estimation are quite subjective and diverse. To thi

External link: http://arxiv.org/abs/2409.06224
