Showing 1 - 10 of 271 results for search: '"Feng Yutong"'
Published in:
Nanophotonics, Vol 11, Iss 4, Pp 847-854 (2021)
Numerous approaches have been developed to generate optical vortex beams carrying orbital angular momentum (OAM) over the past decades, but the direct intracavity generation of such beams with practical output powers in the femtosecond regime still remains …
External link:
https://doaj.org/article/e29facc98b044a8a97f9fb9f3d1a672d
Authors:
Tan, Shuai, Gong, Biao, Feng, Yutong, Zheng, Kecheng, Zheng, Dandan, Shi, Shuwei, Shen, Yujun, Chen, Jingdong, Yang, Ming
Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. … (A minimal sketch of this text-conditioning pattern follows the link below.)
External link:
http://arxiv.org/abs/2412.03085
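The entry above describes video diffusion models that "borrow features from text encoders." Below is a minimal, hedged sketch of that common conditioning pattern: a frozen text encoder produces token embeddings which the denoiser reads through cross-attention. The module names, dimensions, and tensor shapes here are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: injecting frozen text-encoder features into a video
# denoiser via cross-attention. All sizes below are made-up toy values.
import torch
import torch.nn as nn

class CrossAttnBlock(nn.Module):
    def __init__(self, dim=320, text_dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, kdim=text_dim,
                                          vdim=text_dim, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, video_tokens, text_tokens):
        # video_tokens: (B, N_frames*H*W, dim); text_tokens: (B, L, text_dim)
        attended, _ = self.attn(self.norm(video_tokens), text_tokens, text_tokens)
        return video_tokens + attended  # residual injection of text features

video_tokens = torch.randn(2, 16 * 8 * 8, 320)  # toy latent video tokens
text_tokens = torch.randn(2, 77, 768)           # toy frozen text-encoder outputs
out = CrossAttnBlock()(video_tokens, text_tokens)
print(out.shape)  # torch.Size([2, 1024, 320])
```

In practice `text_tokens` would come from a frozen encoder such as CLIP's or T5's text model; the abstract's point is that such borrowed features alone give limited text comprehension.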
Authors:
Huang, Lianghua, Wang, Wei, Wu, Zhi-Fan, Shi, Yupeng, Dou, Huanzhang, Liang, Chen, Feng, Yutong, Liu, Yu, Zhou, Jingren
Recent research arXiv:2410.15027 has explored the use of diffusion transformers (DiTs) for task-agnostic image generation by simply concatenating attention tokens across images. However, despite substantial computational resources, the fidelity of … (A minimal sketch of this token-concatenation idea follows the link below.)
External link:
http://arxiv.org/abs/2410.23775
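The abstract above refers to concatenating attention tokens across images so a DiT can relate a target image to its context images. The following is a hedged sketch of that idea only: patch tokens from several images are joined into one sequence, self-attention runs over the joint sequence, and the result is split back per image. Names and sizes are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch: joint self-attention over tokens concatenated from
# multiple images, as described in the abstract above.
import torch
import torch.nn as nn

dim, heads = 256, 8
attn = nn.MultiheadAttention(dim, heads, batch_first=True)

def joint_attention(image_token_sets):
    """image_token_sets: list of (B, N_i, dim) patch-token tensors, one per image."""
    lengths = [t.shape[1] for t in image_token_sets]
    joint = torch.cat(image_token_sets, dim=1)   # (B, sum(N_i), dim)
    out, _ = attn(joint, joint, joint)           # full attention across all images
    return list(out.split(lengths, dim=1))       # split back per image

ref_tokens = torch.randn(1, 64, dim)  # tokens of a reference/context image
tgt_tokens = torch.randn(1, 64, dim)  # tokens of the image being generated
ref_out, tgt_out = joint_attention([ref_tokens, tgt_tokens])
print(tgt_out.shape)  # torch.Size([1, 64, 256])
```

The design choice this illustrates is that no task-specific module is needed: cross-image interaction falls out of ordinary self-attention once the token sequences are concatenated.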
Authors:
Huang, Lianghua, Wang, Wei, Wu, Zhi-Fan, Dou, Huanzhang, Shi, Yupeng, Feng, Yutong, Liang, Chen, Liu, Yu, Zhou, Jingren
While large language models (LLMs) have revolutionized natural language processing with their task-agnostic capabilities, visual generation tasks such as image translation, style transfer, and character customization still rely heavily on supervised, …
External link:
http://arxiv.org/abs/2410.15027
Authors:
Wei, Yujie, Zhang, Shiwei, Yuan, Hangjie, Wang, Xiang, Qiu, Haonan, Zhao, Rui, Feng, Yutong, Liu, Feng, Huang, Zhizhong, Ye, Jiaxin, Zhang, Yingya, Shan, Hongming
Recent advances in customized video generation have enabled users to create videos tailored to both specific subjects and motion trajectories. However, existing methods often require complicated test-time fine-tuning and struggle with balancing subject …
External link:
http://arxiv.org/abs/2410.13830
Authors:
Chen, Xi, Feng, Yutong, Chen, Mengting, Wang, Yiyang, Zhang, Shilong, Liu, Yu, Shen, Yujun, Zhao, Hengshuang
Image editing is a practical yet challenging task given the diverse demands from users, where one of the hardest parts is to precisely describe what the edited image should look like. In this work, we present a new form of editing, termed …
External link:
http://arxiv.org/abs/2406.07547
Authors:
Zhang, Shilong, Huang, Lianghua, Chen, Xi, Zhang, Yifei, Wu, Zhi-Fan, Feng, Yutong, Wang, Wei, Shen, Yujun, Liu, Yu, Luo, Ping
This work presents FlashFace, a practical tool with which users can easily personalize their own photos on the fly by providing one or a few reference face images and a text prompt. Our approach is distinguishable from existing human photo customization …
External link:
http://arxiv.org/abs/2403.17008
The air quality inference problem aims to use historical data from a limited number of observation sites to infer the air quality index at an unknown location. Considering the sparsity of data due to the high maintenance cost of the stations, … (A simple baseline sketch of this interpolation setup follows the link below.)
External link:
http://arxiv.org/abs/2403.02354
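The abstract above defines the air quality inference setup: estimate the air quality index (AQI) at an unmonitored location from a few observation sites. As a hedged illustration of that setup only, here is a classical inverse-distance-weighting baseline; it is not the paper's method, and the coordinates and readings are made-up values.

```python
# Illustrative baseline: inverse-distance weighting of sparse AQI readings.
import numpy as np

def idw_aqi(query_xy, site_xy, site_aqi, power=2.0, eps=1e-6):
    """Interpolate AQI at query_xy from sparse monitoring sites."""
    d = np.linalg.norm(site_xy - query_xy, axis=1) + eps  # distances to sites
    w = 1.0 / d ** power                                  # closer sites weigh more
    return float(np.sum(w * site_aqi) / np.sum(w))

site_xy = np.array([[0.0, 0.0], [5.0, 1.0], [2.0, 4.0]])  # known station locations
site_aqi = np.array([42.0, 95.0, 60.0])                    # their observed AQI
print(idw_aqi(np.array([1.0, 1.0]), site_xy, site_aqi))    # estimate at a new point
```

The sparsity the abstract mentions is exactly what makes such distance-only baselines weak, which motivates learned inference models.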
Despite recent progress in text-to-video generation, existing studies usually overlook the fact that only the spatial contents, not the temporal motions, of synthesized videos are controlled by the text. To address this challenge, this work presents …
External link:
http://arxiv.org/abs/2312.02928
Existing text-to-image (T2I) diffusion models usually struggle to interpret complex prompts, especially those involving quantity, object-attribute binding, and multi-subject descriptions. In this work, we introduce a semantic panel as the middleware … (A minimal sketch of such a structured panel follows the link below.)
External link:
http://arxiv.org/abs/2311.17002
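The abstract above introduces a "semantic panel" as middleware between the prompt and the generator. As a hedged sketch of what such a structured intermediate might look like, the snippet below represents the prompt as a list of object entries with captions and layout boxes; the dataclass, field names, and hard-coded parser output are assumptions for illustration, not the paper's schema.

```python
# Illustrative sketch: a structured "panel" of objects with attributes and
# layout boxes, built from a prompt and then handed to a T2I generator.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PanelEntry:
    caption: str                              # e.g. "a red apple"
    bbox: Tuple[float, float, float, float]   # normalized (x0, y0, x1, y1)

def build_panel(prompt: str) -> List[PanelEntry]:
    # In practice an LLM or parser would extract objects, attributes, counts,
    # and layout from the prompt; hard-coded here purely for illustration.
    return [
        PanelEntry("a red apple", (0.10, 0.55, 0.40, 0.90)),
        PanelEntry("a green apple", (0.55, 0.55, 0.85, 0.90)),
    ]

panel = build_panel("two apples on a table, one red and one green")
for entry in panel:
    print(entry.caption, entry.bbox)  # structured layout passed to the generator
```

Splitting generation into prompt-to-panel and panel-to-image stages is what lets the middleware carry quantities and attribute bindings explicitly instead of leaving them implicit in the raw text.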