Search results - "Wasim, Syed Talal"

Report

Distillation-free Scaling of Large SSMs for Images and Videos

Authors: Suleman, Hamid; Wasim, Syed Talal; Naseer, Muzammal; Gall, Juergen

State-space models (SSMs), exemplified by S4, have introduced a novel context modeling method by integrating state-space techniques into deep learning. However, they struggle with global context modeling due to their data-independent matrices. The Ma

External link: http://arxiv.org/abs/2409.11867

Report

GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model

Authors: Shaker, Abdelrahman; Wasim, Syed Talal; Khan, Salman; Gall, Juergen; Khan, Fahad Shahbaz

Recent advancements in state-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity. However, pure SSM-based models still face challenges related to stability and achieving optimal pe

External link: http://arxiv.org/abs/2407.13772

Report

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

Authors: Shaker, Abdelrahman; Wasim, Syed Talal; Danelljan, Martin; Khan, Salman; Yang, Ming-Hsuan; Khan, Fahad Shahbaz

Recently, transformer-based approaches have shown promising results for semi-supervised video object segmentation. However, these approaches typically struggle on long videos due to increased GPU memory demands, as they frequently expand the memory b

External link: http://arxiv.org/abs/2403.17937

Report

Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

Authors: Wasim, Syed Talal; Naseer, Muzammal; Khan, Salman; Yang, Ming-Hsuan; Khan, Fahad Shahbaz

Video grounding aims to localize a spatio-temporal section in a video corresponding to an input text query. This paper addresses a critical limitation in current video grounding methodologies by introducing an Open-Vocabulary Spatio-Temporal Video Gr

External link: http://arxiv.org/abs/2401.00901

Report

Hardware Resilience Properties of Text-Guided Image Classifiers

Authors: Wasim, Syed Talal; Soboka, Kabila Haile; Mahmoud, Abdulrahman; Khan, Salman; Brooks, David; Wei, Gu-Yeon

This paper presents a novel method to enhance the reliability of image classification models during deployment in the face of transient hardware errors. By utilizing enriched text embeddings derived from GPT-3 with question prompts per class and CLIP

External link: http://arxiv.org/abs/2311.14062

Report

Self-regulating Prompts: Foundational Model Adaptation without Forgetting

Authors: Khattak, Muhammad Uzair; Wasim, Syed Talal; Naseer, Muzammal; Khan, Salman; Yang, Ming-Hsuan; Khan, Fahad Shahbaz

Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. Conventionally trained using the task-specific objective, i.e., cross-entropy loss, prompts tend to overfit downs

External link: http://arxiv.org/abs/2307.06948

Report

Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition

Authors: Wasim, Syed Talal; Khattak, Muhammad Uzair; Naseer, Muzammal; Khan, Salman; Shah, Mubarak; Khan, Fahad Shahbaz

Recent video recognition models utilize Transformer models for long-range spatio-temporal context modeling. Video transformer designs are based on self-attention that can model global context at a high computational cost. In comparison, convolutional

External link: http://arxiv.org/abs/2307.06947

Report

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

Authors: Wasim, Syed Talal; Naseer, Muzammal; Khan, Salman; Khan, Fahad Shahbaz; Shah, Mubarak

Adopting contrastive image-text pretrained models like CLIP towards video classification has gained attention due to its cost-effectiveness and competitive performance. However, recent works in this area face a trade-off. Finetuning the pretrained mo

External link: http://arxiv.org/abs/2304.03307

Toward Automatic Typography Analysis: Serif Classification and Font Similarities

Authors: Wasim, Syed Talal; Collaud, Romain; Défayes, Lara; Henchoz, Nicolas; Salzmann, Mathieu; Ribes, Delphine

Whether a document is of historical or contemporary significance, typography plays a crucial role in its composition. From the earliest forms of writing in Mesopotamia to the early days of modern printing, typographic techniques have evolved and tran

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::59b232be8cd78991a262b987b1ddb513

Analyzing Poster Collections using Automatic Serif Classification and Font Similarities

Authors: Wasim, Syed Talal; Collaud, Romain; Défayes, Lara; Henchoz, Nicolas; Salzmann, Mathieu; Ribes Lemay, Delphine

Using a digital collection composed of more than 52'000 posters from different years, designers, topics, and clients, we study the feasibility of comparing posters based on their typographic features. To this end, we explore the possibilities of trai

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::4f22ad42df5212b3ab3250a2f346fc29
