Showing 1 - 10 of 16 for search: '"Wasim, Syed Talal"'
State-space models (SSMs), exemplified by S4, have introduced a novel context modeling method by integrating state-space techniques into deep learning. However, they struggle with global context modeling due to their data-independent matrices. The Ma
External link:
http://arxiv.org/abs/2409.11867
Recent advancements in state-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity. However, pure SSM-based models still face challenges related to stability and achieving optimal pe
External link:
http://arxiv.org/abs/2407.13772
Author:
Shaker, Abdelrahman, Wasim, Syed Talal, Danelljan, Martin, Khan, Salman, Yang, Ming-Hsuan, Khan, Fahad Shahbaz
Recently, transformer-based approaches have shown promising results for semi-supervised video object segmentation. However, these approaches typically struggle on long videos due to increased GPU memory demands, as they frequently expand the memory b
External link:
http://arxiv.org/abs/2403.17937
Video grounding aims to localize a spatio-temporal section in a video corresponding to an input text query. This paper addresses a critical limitation in current video grounding methodologies by introducing an Open-Vocabulary Spatio-Temporal Video Gr
External link:
http://arxiv.org/abs/2401.00901
Author:
Wasim, Syed Talal, Soboka, Kabila Haile, Mahmoud, Abdulrahman, Khan, Salman, Brooks, David, Wei, Gu-Yeon
This paper presents a novel method to enhance the reliability of image classification models during deployment in the face of transient hardware errors. By utilizing enriched text embeddings derived from GPT-3 with question prompts per class and CLIP
External link:
http://arxiv.org/abs/2311.14062
Author:
Khattak, Muhammad Uzair, Wasim, Syed Talal, Naseer, Muzammal, Khan, Salman, Yang, Ming-Hsuan, Khan, Fahad Shahbaz
Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. Conventionally trained using the task-specific objective, i.e., cross-entropy loss, prompts tend to overfit downs
External link:
http://arxiv.org/abs/2307.06948
Author:
Wasim, Syed Talal, Khattak, Muhammad Uzair, Naseer, Muzammal, Khan, Salman, Shah, Mubarak, Khan, Fahad Shahbaz
Recent video recognition models utilize Transformer models for long-range spatio-temporal context modeling. Video transformer designs are based on self-attention that can model global context at a high computational cost. In comparison, convolutional
External link:
http://arxiv.org/abs/2307.06947
Adopting contrastive image-text pretrained models like CLIP towards video classification has gained attention due to its cost-effectiveness and competitive performance. However, recent works in this area face a trade-off. Finetuning the pretrained mo
External link:
http://arxiv.org/abs/2304.03307
Author:
Wasim, Syed Talal, Collaud, Romain, Défayes, Lara, Henchoz, Nicolas, Salzmann, Mathieu, Ribes, Delphine
Whether a document is of historical or contemporary significance, typography plays a crucial role in its composition. From the earliest forms of writing in Mesopotamia to the early days of modern printing, typographic techniques have evolved and tran
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::59b232be8cd78991a262b987b1ddb513
Author:
Wasim, Syed Talal, Collaud, Romain, Défayes, Lara, Henchoz, Nicolas, Salzmann, Mathieu, Ribes Lemay, Delphine
Using a digital collection composed of more than 52'000 posters from different years, designers, topics, and clients, we study the feasibility of comparing posters based on their typographic features. To this end, we explore the possibilities of trai
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::4f22ad42df5212b3ab3250a2f346fc29