Showing 1 - 10
of 8 240
for search: '"TANG, Ming"'
Visual Anomaly Detection (VAD) aims to identify abnormal samples in images that deviate from normal patterns, covering multiple domains, including industrial, logical, and medical fields. Due to the domain gaps between these fields, existing VAD meth
External link:
http://arxiv.org/abs/2412.03342
Overfitting has long been stigmatized as detrimental to model performance, especially in the context of anomaly detection. Our work challenges this conventional view by introducing a paradigm shift, recasting overfitting as a controllable and strateg
External link:
http://arxiv.org/abs/2412.00560
Continual learning (CL) is crucial for language models to dynamically adapt to the evolving real-world demands. To mitigate the catastrophic forgetting problem in CL, data replay has been proven a simple and effective strategy, and the subsequent dat
External link:
http://arxiv.org/abs/2411.06171
Large Multimodal Models (LMMs) have achieved significant breakthroughs in various vision-language and vision-centric tasks based on auto-regressive modeling. However, these models typically focus on either vision-centric tasks, such as visual groundi
External link:
http://arxiv.org/abs/2410.16163
Author:
Kou, Wei-Bin, Lin, Qingfeng, Tang, Ming, Ye, Rongguang, Wang, Shuai, Zhu, Guangxu, Wu, Yik-Chung
Street Scene Semantic Understanding (denoted as TriSU) is a complex task for autonomous driving (AD). However, an inference model trained on data from a particular geographical region generalizes poorly when applied to other regions due to inter-
External link:
http://arxiv.org/abs/2409.19560
In the realm of emerging real-time networked applications like cyber-physical systems (CPS), the Age of Information (AoI) has emerged as a pivotal metric for evaluating timeliness. To meet the high computational demands, such as those in intellige
External link:
http://arxiv.org/abs/2409.16832
We report on broadband generation based on noise-like pulse (NLP) fiber lasers at 1.55 μm and 1.06 μm, respectively. The 1.55 μm laser system can generate a broadband spectrum with a 20 dB bandwidth of up to 205 nm, while the 1.06 μm
External link:
http://arxiv.org/abs/2409.15115
Author:
Vu, Tuan-Hung, Valle, Eduardo, Bursuc, Andrei, Kerssies, Tommie, de Geus, Daan, Dubbelman, Gijs, Qian, Long, Zhu, Bingke, Chen, Yingying, Tang, Ming, Wang, Jinqiao, Vojíř, Tomáš, Šochman, Jan, Matas, Jiří, Smith, Michael, Ferrie, Frank, Basu, Shamik, Sakaridis, Christos, Van Gool, Luc
We propose the unified BRAVO challenge to benchmark the reliability of semantic segmentation models under realistic perturbations and unknown out-of-distribution (OOD) scenarios. We define two categories of reliability: (1) semantic reliability, whic
External link:
http://arxiv.org/abs/2409.15107
Author:
Kou, Wei-Bin, Zhu, Guangxu, Ye, Rongguang, Lin, Qingfeng, Ren, Zeyi, Tang, Ming, Wu, Yik-Chung
Various adverse weather conditions pose a significant challenge to autonomous driving (AD) street scene semantic understanding (segmentation). A common strategy is to minimize the disparity between images captured in clear and adverse weather conditi
External link:
http://arxiv.org/abs/2409.14737
Pretrained vision-language models (VLMs), e.g., CLIP, are increasingly used to bridge the gap between open- and closed-vocabulary recognition in open-vocabulary image segmentation. As VLMs are generally pretrained with low-resolution images (e.g. $224\t
External link:
http://arxiv.org/abs/2408.14776