Showing 1 - 10 of 3,604 results for search: '"Shen, Fei"'
This work proposes FireRedTTS, a foundation text-to-speech framework, to meet the growing demands for personalized and diverse generative speech applications. The framework comprises three parts: data processing, foundation system, and downstream app
External link:
http://arxiv.org/abs/2409.03283
Deep learning has boosted automatic diabetic retinopathy (DR) diagnosis, greatly helping ophthalmologists with early disease detection, which contributes to preventing disease deterioration that may eventually lead to blindness. It has been pro
External link:
http://arxiv.org/abs/2408.07264
Published in:
Proceedings of the 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2024. p. 406-416
In IT system operations, shell commands are common command line tools used by site reliability engineers (SREs) for daily tasks, such as system configuration, package deployment, and performance optimization. The efficiency in their execution has a c
External link:
http://arxiv.org/abs/2408.05592
Image generation can solve insufficient labeled data issues in defect detection. Most defect generation methods are only trained on a single product without considering the consistencies among multiple products, leading to poor quality and diversity
External link:
http://arxiv.org/abs/2408.00372
Reconstructing textureless areas in MVS poses challenges due to the absence of reliable pixel correspondences within a fixed patch. Although certain methods employ patch deformation to expand the receptive field, their patches mistakenly skip depth edg
External link:
http://arxiv.org/abs/2407.19323
Latest advances have achieved realistic virtual try-on (VTON) through localized garment inpainting using latent diffusion models, significantly enhancing consumers' online shopping experience. However, existing VTON technologies neglect the need for
External link:
http://arxiv.org/abs/2407.12705
Recently, large-scale vision-language models such as CLIP have demonstrated immense potential in the zero-shot anomaly segmentation (ZSAS) task, utilizing a unified model to directly detect anomalies on any unseen product with painstakingly crafted text
External link:
http://arxiv.org/abs/2407.12276
Recent research showcases the considerable potential of conditional diffusion models for generating consistent stories. However, current methods, which predominantly generate stories in an autoregressive and excessively caption-dependent manner, ofte
External link:
http://arxiv.org/abs/2407.02482
Authors:
Wang, Cong, Tian, Kuan, Zhang, Jun, Guan, Yonghang, Luo, Feng, Shen, Fei, Jiang, Zhiwei, Gu, Qing, Han, Xiao, Yang, Wei
In the field of portrait video generation, the use of single images to generate portrait videos has become increasingly prevalent. A common approach involves leveraging generative models to enhance adapters for controlled generation. However, control
External link:
http://arxiv.org/abs/2406.02511
Authors:
Wang, Cong, Tian, Kuan, Guan, Yonghang, Zhang, Jun, Jiang, Zhiwei, Shen, Fei, Han, Xiao, Gu, Qing, Yang, Wei
The success of the text-guided diffusion model has inspired the development and release of numerous powerful diffusion models within the open-source community. These models are typically fine-tuned on various expert datasets, showcasing diverse denoi
External link:
http://arxiv.org/abs/2405.17082