Showing 1 - 10 of 92 for the search: '"Zhang, ShengChuan"'
Open-vocabulary panoptic segmentation aims to segment and classify everything in diverse scenes across an unbounded vocabulary. Existing methods typically employ a two-stage or single-stage framework. The two-stage framework involves cropping the image
External link:
http://arxiv.org/abs/2412.08628
Due to the scarcity and unpredictable nature of defect samples, industrial anomaly detection (IAD) predominantly employs unsupervised learning. However, all unsupervised IAD methods face a common challenge: the inherent bias in normal samples, which
External link:
http://arxiv.org/abs/2412.08189
Author:
Lai, Xunfa, Yang, Zhiyu, Hu, Jie, Zhang, Shengchuan, Cao, Liujuan, Jiang, Guannan, Wang, Zhiyu, Zhang, Songan, Ji, Rongrong
Existing camouflaged object detection (COD) methods depend heavily on large-scale pixel-level annotations. However, acquiring such annotations is laborious due to the inherent camouflage characteristics of the objects. Semi-supervised learning offers a
External link:
http://arxiv.org/abs/2408.08050
The Segment Anything Model (SAM) has advanced interactive segmentation but is limited by the high computational cost on high-resolution images. This requires downsampling to meet GPU constraints, sacrificing the fine-grained details needed for high-p
External link:
http://arxiv.org/abs/2407.02109
Author:
Gao, Timin, Pan, Wensheng, Zhang, Yan, Zhao, Sicheng, Zhang, Shengchuan, Zheng, Xiawu, Li, Ke, Cao, Liujuan, Ji, Rongrong
Contrastive learning has considerably advanced the field of Image Quality Assessment (IQA), emerging as a widely adopted technique. The core mechanism of contrastive learning involves minimizing the distance between quality-similar (positive) example
External link:
http://arxiv.org/abs/2406.19247
Author:
Li, Xinyang, Lai, Zhangyu, Xu, Linning, Qu, Yansong, Cao, Liujuan, Zhang, Shengchuan, Dai, Bo, Ji, Rongrong
Recent advancements in 3D generation have leveraged synthetic datasets with ground truth 3D assets and predefined cameras. However, the potential of adopting real-world datasets, which can produce significantly more realistic 3D scenes, remains large
External link:
http://arxiv.org/abs/2406.17601
Author:
Huang, You, Lan, Zongyu, Cao, Liujuan, Lin, Xianming, Zhang, Shengchuan, Jiang, Guannan, Ji, Rongrong
The Segment Anything Model (SAM) marks a notable milestone in segmentation models, highlighted by its robust zero-shot capabilities and ability to handle diverse prompts. SAM follows a pipeline that separates interactive segmentation into image prepr
External link:
http://arxiv.org/abs/2405.18706
Author:
Qu, Yansong, Dai, Shaohui, Li, Xinyang, Lin, Jianghang, Cao, Liujuan, Zhang, Shengchuan, Ji, Rongrong
3D open-vocabulary scene understanding, crucial for advancing augmented reality and robotic applications, involves interpreting and locating specific regions within a 3D space as directed by natural language instructions. To this end, we introduce GO
External link:
http://arxiv.org/abs/2405.17596
Author:
Li, Xinyang, Lai, Zhangyu, Xu, Linning, Guo, Jianfei, Cao, Liujuan, Zhang, Shengchuan, Dai, Bo, Ji, Rongrong
We present Dual3D, a novel text-to-3D generation framework that generates high-quality 3D assets from texts in only one minute. The key component is a dual-mode multi-view latent diffusion model. Given the noisy multi-view latents, the 2D mode can eff
External link:
http://arxiv.org/abs/2405.09874
Author:
Gao, Timin, Chen, Peixian, Zhang, Mengdan, Fu, Chaoyou, Shen, Yunhang, Zhang, Yan, Zhang, Shengchuan, Zheng, Xiawu, Sun, Xing, Cao, Liujuan, Ji, Rongrong
With the advent of large language models (LLMs) enhanced by the chain-of-thought (CoT) methodology, the visual reasoning problem is usually decomposed into manageable sub-tasks and tackled sequentially with various external tools. However, such a paradigm
External link:
http://arxiv.org/abs/2404.16033