Showing 1 - 10 of 876 for search: '"Bennamoun, Mohammed"'
Authors:
Khanam, Tahmina, Laga, Hamid, Bennamoun, Mohammed, Wang, Guanjin, Sohel, Ferdous, Boussaid, Farid, Wang, Guan, Srivastava, Anuj
We propose the first comprehensive approach for modeling and analyzing the spatiotemporal shape variability in tree-like 4D objects, i.e., 3D objects whose shapes bend, stretch, and change in their branching structure over time as they deform, grow,
External link:
http://arxiv.org/abs/2408.12443
Authors:
Taghipour, Ashkan, Ghahremani, Morteza, Bennamoun, Mohammed, Rekavandi, Aref Miri, Li, Zinuo, Laga, Hamid, Boussaid, Farid
This paper investigates the role of CLIP image embeddings within the Stable Video Diffusion (SVD) framework, focusing on their impact on video generation quality and computational efficiency. Our findings indicate that CLIP embeddings, while crucial
External link:
http://arxiv.org/abs/2407.19205
Authors:
Wang, Qi, Xu, Zhou, Lin, Yuming, Ye, Jingtao, Li, Hongsheng, Zhu, Guangming, Shah, Syed Afaq Ali, Bennamoun, Mohammed, Zhang, Liang
Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based came
External link:
http://arxiv.org/abs/2407.05106
Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey
Estimating depth from single RGB images and videos is of widespread interest due to its applications in many areas, including autonomous driving, 3D reconstruction, digital entertainment, and robotics. More than 500 deep learning-based papers have be
External link:
http://arxiv.org/abs/2406.19675
Radio Frequency Interference (RFI) poses a significant challenge in radio astronomy, arising from terrestrial and celestial sources, disrupting observations conducted by radio telescopes. Addressing RFI involves intricate heuristic algorithms, manual
External link:
http://arxiv.org/abs/2406.06075
Authors:
Javed, Sajid, Mahmood, Arif, Ganapathi, Iyyakutti Iyappan, Dharejo, Fayaz Ali, Werghi, Naoufel, Bennamoun, Mohammed
This paper proposes Comprehensive Pathology Language Image Pre-training (CPLIP), a new unsupervised technique designed to enhance the alignment of images and text in histopathology for tasks such as classification and segmentation. This methodology e
External link:
http://arxiv.org/abs/2406.05205
While neural networks have excelled in video action recognition tasks, their black-box nature often obscures the understanding of their decision-making processes. Recent approaches used inherently interpretable models to analyze video actions in a ma
External link:
http://arxiv.org/abs/2404.01591
Referring Video Object Segmentation (R-VOS) methods face challenges in maintaining consistent object segmentation due to temporal context variability and the presence of other visually similar objects. We propose an end-to-end R-VOS paradigm that exp
External link:
http://arxiv.org/abs/2403.19407
Most existing weakly supervised semantic segmentation (WSSS) methods rely on Class Activation Mapping (CAM) to extract coarse class-specific localization maps using image-level labels. Prior works have commonly used an off-line heuristic thresholding
External link:
http://arxiv.org/abs/2403.01156
Authors:
Taghipour, Ashkan, Ghahremani, Morteza, Bennamoun, Mohammed, Rekavandi, Aref Miri, Laga, Hamid, Boussaid, Farid
While latent diffusion models (LDMs) excel at creating imaginative images, they often lack precision in semantic fidelity and spatial control over where objects are generated. To address these deficiencies, we introduce the Box-it-to-Bind-it (B2B) mo
External link:
http://arxiv.org/abs/2402.17910