Showing 1 - 10 of 363 for search: '"Chan, Antoni B."'
Multi-view crowd localization predicts the ground locations of all people in the scene. Typical methods usually estimate the crowd density maps on the ground plane first, and then obtain the crowd locations. However, the performance of existing metho
External link:
http://arxiv.org/abs/2409.01726
Recent deep learning-based multi-view people detection (MVD) methods have shown promising results on existing datasets. However, current methods are mainly trained and evaluated on small, single scenes with a limited number of multi-view frames and f
External link:
http://arxiv.org/abs/2405.19943
In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential
External link:
http://arxiv.org/abs/2405.08886
Precise image editing with text-to-image models has attracted increasing interest due to their remarkable generative capabilities and user-friendly nature. However, such attempts face the pivotal challenge of misalignment between the intended precise
External link:
http://arxiv.org/abs/2404.11895
Authors:
Wu, Qiangqiang, Chan, Antoni B.
Existing deep trackers are typically trained with large-scale video frames with annotated bounding boxes. However, these bounding boxes are expensive and time-consuming to annotate, in particular for large-scale datasets. In this paper, we propose to
External link:
http://arxiv.org/abs/2404.09504
Authors:
Lin, Wei, Chan, Antoni B.
Existing class-agnostic counting models typically rely on a single type of prompt, e.g., box annotations. This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for concerned objects indicated
External link:
http://arxiv.org/abs/2403.10236
The existing crowd counting models require extensive training data, which is time-consuming to annotate. To tackle this issue, we propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM), an ada
External link:
http://arxiv.org/abs/2402.17514
Published in:
ACM TOMM 2024. Code: https://github.com/mdswyz/TCVE
Large-scale text-to-image (T2I) diffusion models have been extended for text-guided video editing, yielding impressive zero-shot video editing performance. Nonetheless, the generated videos usually show spatial irregularities and temporal inconsisten
External link:
http://arxiv.org/abs/2308.09091
We examined whether embedding human attention knowledge into saliency-based explainable AI (XAI) methods for computer vision models could enhance their plausibility and faithfulness. We first developed new gradient-based XAI methods for object detect
External link:
http://arxiv.org/abs/2305.03601
Authors:
Zhao, Chenyang, Chan, Antoni B.
We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visualized explanation technique for interpreting the predictions of object detectors. Utilizing the gradients of detector targets flowing into the intermediate feature maps,
External link:
http://arxiv.org/abs/2304.06354