Showing 1 - 10 of 7,772 results for search: '"Schiele, A."'
Authors:
Wu, Yongliang, Hu, Xinting, Sun, Yuyang, Zhou, Yizhou, Zhu, Wenbo, Rao, Fengyun, Schiele, Bernt, Yang, Xu
Video Large Language Models (Vid-LLMs) have made remarkable advancements in comprehending video content for QA dialogue. However, they struggle to extend this visual understanding to tasks requiring precise temporal localization, known as Video Tempo
External link:
http://arxiv.org/abs/2411.10332
B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vis
External link:
http://arxiv.org/abs/2411.00715
Authors:
Wang, Haiyang, Fan, Yue, Naeem, Muhammad Ferjad, Xian, Yongqin, Lenssen, Jan Eric, Wang, Liwei, Tombari, Federico, Schiele, Bernt
Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises primarily f
External link:
http://arxiv.org/abs/2410.23168
This study addresses the deployment challenges of integer-only quantized Transformers on resource-constrained embedded FPGAs (Xilinx Spartan-7 XC7S15). We enhanced the flexibility of our VHDL template by introducing a selectable resource type for sto
External link:
http://arxiv.org/abs/2410.03294
Multiple object tracking in complex scenarios - such as coordinated dance performances, team sports, or dynamic animal groups - presents unique challenges. In these settings, objects frequently move in coordinated patterns, occlude each other, and ex
External link:
http://arxiv.org/abs/2410.01806
The supervision of state-of-the-art multiple object tracking (MOT) methods requires enormous annotation efforts to provide bounding boxes for all frames of all videos, and instance IDs to associate them through time. To this end, we introduce Walker,
External link:
http://arxiv.org/abs/2409.17221
ElasticAI: Creating and Deploying Energy-Efficient Deep Learning Accelerator for Pervasive Computing
Deploying Deep Learning (DL) on embedded end devices is a scorching trend in pervasive computing. Since most Microcontrollers on embedded devices have limited computing power, it is necessary to add a DL accelerator. Embedded Field Programmable Gate
External link:
http://arxiv.org/abs/2409.09044
Authors:
Ling, Tianheng, Schiele, Gregor
Artificial Intelligence (AI) models for time-series in pervasive computing keep getting larger and more complicated. The Transformer model is by far the most compelling of these AI models. However, it is difficult to obtain the desired performance wh
External link:
http://arxiv.org/abs/2408.16495
In this work, we introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels. Training or fine-tuning semantic segmentation models with weak supervision has become an important topi
External link:
http://arxiv.org/abs/2408.12489
Recent approaches have shown that large-scale vision-language models such as CLIP can improve semantic segmentation performance. These methods typically aim for pixel-level vision-language alignment, but often rely on low resolution image features fr
External link:
http://arxiv.org/abs/2407.21654