Search results - "Wu, Yunsheng"

Report

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Authors: Jiang, Boyuan; Hu, Xiaobin; Luo, Donghao; He, Qingdong; Xu, Chengming; Peng, Jinlong; Zhang, Jiangning; Wang, Chengjie; Wu, Yunsheng; Fu, Yanwei

Although image-based virtual try-on has made considerable progress, emerging approaches still encounter challenges in producing high-fidelity and robust fitting images across diverse scenarios. These methods often struggle with issues such as texture

External link: http://arxiv.org/abs/2411.10499

Report

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Authors: Fu, Chaoyou; Lin, Haojia; Long, Zuwei; Shen, Yunhang; Zhao, Meng; Zhang, Yifan; Dong, Shaoqi; Wang, Xiong; Yin, Di; Ma, Long; Zheng, Xiawu; He, Ran; Ji, Rongrong; Wu, Yunsheng; Shan, Caifeng; Sun, Xing

The remarkable multimodal capabilities and interactive experience of GPT-4o underscore their necessity in practical applications, yet open-source models rarely excel in both areas. In this paper, we introduce VITA, the first-ever open-source Multimod

External link: http://arxiv.org/abs/2408.05211

Report

Oracle Bone Inscriptions Multi-modal Dataset

Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can

External link: http://arxiv.org/abs/2407.03900

Report

SlerpFace: Face Template Protection via Spherical Linear Interpolation

Authors: Zhong, Zhizhou; Mi, Yuxi; Huang, Yuge; Xu, Jianqing; Mu, Guodong; Ding, Shouhong; Zhang, Jingyun; Guo, Rizen; Wu, Yunsheng; Zhou, Shuigeng

Contemporary face recognition systems use feature templates extracted from face images to identify persons. To enhance privacy, face template protection techniques are widely employed to conceal sensitive identity and appearance information stored in

External link: http://arxiv.org/abs/2407.03043

Report

DF40: Toward Next-Generation Deepfake Detection

Authors: Yan, Zhiyuan; Yao, Taiping; Chen, Shen; Zhao, Yandan; Fu, Xinghe; Zhu, Junwei; Luo, Donghao; Wang, Chengjie; Ding, Shouhong; Wu, Yunsheng; Yuan, Li

We propose a new comprehensive benchmark to revolutionize the current deepfake detection field to the next generation. Predominantly, existing works identify top-notch detection algorithms and models by adhering to the common practice: training detec

External link: http://arxiv.org/abs/2406.13495

Report

M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising

Authors: Wang, Chengjie; Zhu, Haokun; Peng, Jinlong; Wang, Yue; Yi, Ran; Wu, Yunsheng; Ma, Lizhuang; Zhang, Jiangning

Existing industrial anomaly detection methods primarily concentrate on unsupervised learning with pristine RGB images. Yet, both RGB and 3D data are crucial for anomaly detection, and the datasets are seldom completely clean in practical scenarios. T

External link: http://arxiv.org/abs/2406.02263

Report

Deepfake Generation and Detection: A Benchmark and Survey

Authors: Pei, Gan; Zhang, Jiangning; Hu, Menghan; Zhang, Zhenyu; Wang, Chengjie; Wu, Yunsheng; Zhai, Guangtao; Yang, Jian; Shen, Chunhua; Tao, Dacheng

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, digital human creation, to name a few.

External link: http://arxiv.org/abs/2403.17881

Report

TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation

Authors: Liu, Yufei; Zhu, Junwei; Tang, Junshu; Zhang, Shijie; Zhang, Jiangning; Cao, Weijian; Wang, Chengjie; Wu, Yunsheng; Huang, Dongjin

Texturing 3D humans with semantic UV maps remains a challenge due to the difficulty of acquiring reasonably unfolded UV. Despite recent text-to-3D advancements in supervising multi-view renderings using large text-to-image (T2I) models, issues persis

External link: http://arxiv.org/abs/2403.12906

Report

Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

Authors: Qian, Xuelin; Wang, Yu; Luo, Simian; Zhang, Yinda; Tai, Ying; Zhang, Zhenyu; Wang, Chengjie; Xue, Xiangyang; Zhao, Bo; Huang, Tiejun; Wu, Yunsheng; Fu, Yanwei

Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space. In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improvi

External link: http://arxiv.org/abs/2402.12225

Report

UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation

Authors: He, Qingdong; Peng, Jinlong; Jiang, Zhengkai; Wu, Kai; Ji, Xiaozhong; Zhang, Jiangning; Wang, Yabiao; Wang, Chengjie; Chen, Mingang; Wu, Yunsheng

3D open-vocabulary scene understanding aims to recognize arbitrary novel categories beyond the base label space. However, existing works not only fail to fully utilize all the available modal information in the 3D domain but also lack sufficient gran

External link: http://arxiv.org/abs/2401.11395
