Search results

Report

FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications

Authors: Guo, Hao-Han; Liu, Kun; Shen, Fei-Yu; Wu, Yi-Chen; Xie, Feng-Long; Xie, Kun; Xu, Kai-Tuo

This work proposes FireRedTTS, a foundation text-to-speech framework, to meet the growing demands for personalized and diverse generative speech applications. The framework comprises three parts: data processing, foundation system, and downstream app

External link: http://arxiv.org/abs/2409.03283

Report

Lesion-aware network for diabetic retinopathy diagnosis

Authors: Xia, Xue; Zhan, Kun; Fang, Yuming; Jiang, Wenhui; Shen, Fei

Deep learning brought boosts to auto diabetic retinopathy (DR) diagnosis, thus, greatly helping ophthalmologists for early disease detection, which contributes to preventing disease deterioration that may eventually lead to blindness. It has been pro

External link: http://arxiv.org/abs/2408.07264

Report

SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations

Authors: Tonon, Andrea; Caglayan, Bora; Wang, MingXue; Hu, Peng; Shen, Fei; Zhang, Puchao

Published in: Proceedings of the 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2024. pp. 406-416

In IT system operations, shell commands are common command line tools used by site reliability engineers (SREs) for daily tasks, such as system configuration, package deployment, and performance optimization. The efficiency in their execution has a c

External link: http://arxiv.org/abs/2408.05592

Report

Few-shot Defect Image Generation based on Consistency Modeling

Authors: Shi, Qingfeng; Wei, Jing; Shen, Fei; Zhang, Zhengtao

Image generation can solve insufficient labeled data issues in defect detection. Most defect generation methods are only trained on a single product without considering the consistencies among multiple products, leading to poor quality and diversity

External link: http://arxiv.org/abs/2408.00372

Report

MSP-MVS: Multi-granularity Segmentation Prior Guided Multi-View Stereo

Authors: Yuan, Zhenlong; Liu, Cong; Shen, Fei; Li, Zhaoxin; Mao, Tianlu; Wang, Zhaoqi

Reconstructing textureless areas in MVS poses challenges due to the absence of reliable pixel correspondences within fixed patch. Although certain methods employ patch deformation to expand the receptive field, their patches mistakenly skip depth edg

External link: http://arxiv.org/abs/2407.19323

Report

IMAGDressing-v1: Customizable Virtual Dressing

Authors: Shen, Fei; Jiang, Xin; He, Xin; Ye, Hu; Wang, Cong; Du, Xiaoyu; Li, Zechao; Tang, Jinhui

Latest advances have achieved realistic virtual try-on (VTON) through localized garment inpainting using latent diffusion models, significantly enhancing consumers' online shopping experience. However, existing VTON technologies neglect the need for

External link: http://arxiv.org/abs/2407.12705

Report

VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

Authors: Qu, Zhen; Tao, Xian; Prasad, Mukesh; Shen, Fei; Zhang, Zhengtao; Gong, Xinyi; Ding, Guiguang

Recently, large-scale vision-language models such as CLIP have demonstrated immense potential in zero-shot anomaly segmentation (ZSAS) task, utilizing a unified model to directly detect anomalies on any unseen product with painstakingly crafted text

External link: http://arxiv.org/abs/2407.12276

Report

Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models

Authors: Shen, Fei; Ye, Hu; Liu, Sibo; Zhang, Jun; Wang, Cong; Han, Xiao; Yang, Wei

Recent research showcases the considerable potential of conditional diffusion models for generating consistent stories. However, current methods, which predominantly generate stories in an autoregressive and excessively caption-dependent manner, ofte

External link: http://arxiv.org/abs/2407.02482

Report

V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation

Authors: Wang, Cong; Tian, Kuan; Zhang, Jun; Guan, Yonghang; Luo, Feng; Shen, Fei; Jiang, Zhiwei; Gu, Qing; Han, Xiao; Yang, Wei

In the field of portrait video generation, the use of single images to generate portrait videos has become increasingly prevalent. A common approach involves leveraging generative models to enhance adapters for controlled generation. However, control

External link: http://arxiv.org/abs/2406.02511

Report

Ensembling Diffusion Models via Adaptive Feature Aggregation

Authors: Wang, Cong; Tian, Kuan; Guan, Yonghang; Zhang, Jun; Jiang, Zhiwei; Shen, Fei; Han, Xiao; Gu, Qing; Yang, Wei

The success of the text-guided diffusion model has inspired the development and release of numerous powerful diffusion models within the open-source community. These models are typically fine-tuned on various expert datasets, showcasing diverse denoi

External link: http://arxiv.org/abs/2405.17082
