Search results - "Bennamoun, Mohammed"

Report

A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-shaped Structures

Authors: Khanam, Tahmina; Laga, Hamid; Bennamoun, Mohammed; Wang, Guanjin; Sohel, Ferdous; Boussaid, Farid; Wang, Guan; Srivastava, Anuj

We propose the first comprehensive approach for modeling and analyzing the spatiotemporal shape variability in tree-like 4D objects, i.e., 3D objects whose shapes bend, stretch, and change in their branching structure over time as they deform, grow,

External link: http://arxiv.org/abs/2408.12443

Report

Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions

Authors: Taghipour, Ashkan; Ghahremani, Morteza; Bennamoun, Mohammed; Rekavandi, Aref Miri; Li, Zinuo; Laga, Hamid; Boussaid, Farid

This paper investigates the role of CLIP image embeddings within the Stable Video Diffusion (SVD) framework, focusing on their impact on video generation quality and computational efficiency. Our findings indicate that CLIP embeddings, while crucial

External link: http://arxiv.org/abs/2407.19205

Report

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Authors: Wang, Qi; Xu, Zhou; Lin, Yuming; Ye, Jingtao; Li, Hongsheng; Zhu, Guangming; Shah, Syed Afaq Ali; Bennamoun, Mohammed; Zhang, Liang

Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based came

External link: http://arxiv.org/abs/2407.05106

Report

Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey

Authors: Rajapaksha, Uchitha; Sohel, Ferdous; Laga, Hamid; Diepeveen, Dean; Bennamoun, Mohammed

Estimating depth from single RGB images and videos is of widespread interest due to its applications in many areas, including autonomous driving, 3D reconstruction, digital entertainment, and robotics. More than 500 deep learning-based papers have be

External link: http://arxiv.org/abs/2406.19675

Report

Supervised Radio Frequency Interference Detection with SNNs

Authors: Pritchard, Nicholas J.; Wicenec, Andreas; Bennamoun, Mohammed; Dodson, Richard

Radio Frequency Interference (RFI) poses a significant challenge in radio astronomy, arising from terrestrial and celestial sources, disrupting observations conducted by radio telescopes. Addressing RFI involves intricate heuristic algorithms, manual

External link: http://arxiv.org/abs/2406.06075

Report

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

Authors: Javed, Sajid; Mahmood, Arif; Ganapathi, Iyyakutti Iyappan; Dharejo, Fayaz Ali; Werghi, Naoufel; Bennamoun, Mohammed

This paper proposes Comprehensive Pathology Language Image Pre-training (CPLIP), a new unsupervised technique designed to enhance the alignment of images and text in histopathology for tasks such as classification and segmentation. This methodology e

External link: http://arxiv.org/abs/2406.05205

Report

Language Model Guided Interpretable Video Action Reasoning

Authors: Wang, Ning; Zhu, Guangming; Li, HS; Zhang, Liang; Shah, Syed Afaq Ali; Bennamoun, Mohammed

While neural networks have excelled in video action recognition tasks, their black-box nature often obscures the understanding of their decision-making processes. Recent approaches used inherently interpretable models to analyze video actions in a ma

External link: http://arxiv.org/abs/2404.01591

Report

Towards Temporally Consistent Referring Video Object Segmentation

Authors: Miao, Bo; Bennamoun, Mohammed; Gao, Yongsheng; Shah, Mubarak; Mian, Ajmal

Referring Video Object Segmentation (R-VOS) methods face challenges in maintaining consistent object segmentation due to temporal context variability and the presence of other visually similar objects. We propose an end-to-end R-VOS paradigm that exp

External link: http://arxiv.org/abs/2403.19407

Report

Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation

Authors: Xu, Lian; Bennamoun, Mohammed; Boussaid, Farid; Ouyang, Wanli; Sohel, Ferdous; Xu, Dan

Most existing weakly supervised semantic segmentation (WSSS) methods rely on Class Activation Mapping (CAM) to extract coarse class-specific localization maps using image-level labels. Prior works have commonly used an off-line heuristic thresholding

External link: http://arxiv.org/abs/2403.01156

Report

Box It to Bind It: Unified Layout Control and Attribute Binding in T2I Diffusion Models

Authors: Taghipour, Ashkan; Ghahremani, Morteza; Bennamoun, Mohammed; Rekavandi, Aref Miri; Laga, Hamid; Boussaid, Farid

While latent diffusion models (LDMs) excel at creating imaginative images, they often lack precision in semantic fidelity and spatial control over where objects are generated. To address these deficiencies, we introduce the Box-it-to-Bind-it (B2B) mo

External link: http://arxiv.org/abs/2402.17910
