Showing 1 - 10 of 271 results for search: '"Feng Yutong"'
Published in:
Nanophotonics, Vol 11, Iss 4, Pp 847-854 (2021)
Numerous approaches have been developed to generate optical vortex beams carrying orbital angular momentum (OAM) over the past decades, but the direct intracavity generation of such beams with practical output powers in the femtosecond regime still remains …
External link:
https://doaj.org/article/e29facc98b044a8a97f9fb9f3d1a672d
Authors:
Tan, Shuai, Gong, Biao, Feng, Yutong, Zheng, Kecheng, Zheng, Dandan, Shi, Shuwei, Shen, Yujun, Chen, Jingdong, Yang, Ming
Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. … (A minimal sketch of this text-conditioning pattern follows the link below.)
External link:
http://arxiv.org/abs/2412.03085
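The entry above describes video diffusion models that "borrow features from text encoders." Below is a minimal, hedged sketch of that common conditioning pattern: a frozen text encoder produces token embeddings which the denoiser reads through cross-attention. The module names, dimensions, and tensor shapes here are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: injecting frozen text-encoder features into a video
# denoiser via cross-attention. All sizes below are made-up toy values.
import torch
import torch.nn as nn

class CrossAttnBlock(nn.Module):
    def __init__(self, dim=320, text_dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, kdim=text_dim,
                                          vdim=text_dim, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, video_tokens, text_tokens):
        # video_tokens: (B, N_frames*H*W, dim); text_tokens: (B, L, text_dim)
        attended, _ = self.attn(self.norm(video_tokens), text_tokens, text_tokens)
        return video_tokens + attended  # residual injection of text features

video_tokens = torch.randn(2, 16 * 8 * 8, 320)  # toy latent video tokens
text_tokens = torch.randn(2, 77, 768)           # toy frozen text-encoder outputs
out = CrossAttnBlock()(video_tokens, text_tokens)
print(out.shape)  # torch.Size([2, 1024, 320])
```

In practice `text_tokens` would come from a frozen encoder such as CLIP's or T5's text model; the abstract's point is that such borrowed features alone give limited text comprehension.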
Authors:
Huang, Lianghua, Wang, Wei, Wu, Zhi-Fan, Shi, Yupeng, Dou, Huanzhang, Liang, Chen, Feng, Yutong, Liu, Yu, Zhou, Jingren
Recent research arXiv:2410.15027 has explored the use of diffusion transformers (DiTs) for task-agnostic image generation by simply concatenating attention tokens across images. However, despite substantial computational resources, the fidelity of … (A minimal sketch of this token-concatenation idea follows the link below.)
External link:
http://arxiv.org/abs/2410.23775
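The abstract above refers to concatenating attention tokens across images so a DiT can relate a target image to its context images. The following is a hedged sketch of that idea only: patch tokens from several images are joined into one sequence, self-attention runs over the joint sequence, and the result is split back per image. Names and sizes are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch: joint self-attention over tokens concatenated from
# multiple images, as described in the abstract above.
import torch
import torch.nn as nn

dim, heads = 256, 8
attn = nn.MultiheadAttention(dim, heads, batch_first=True)

def joint_attention(image_token_sets):
    """image_token_sets: list of (B, N_i, dim) patch-token tensors, one per image."""
    lengths = [t.shape[1] for t in image_token_sets]
    joint = torch.cat(image_token_sets, dim=1)   # (B, sum(N_i), dim)
    out, _ = attn(joint, joint, joint)           # full attention across all images
    return list(out.split(lengths, dim=1))       # split back per image

ref_tokens = torch.randn(1, 64, dim)  # tokens of a reference/context image
tgt_tokens = torch.randn(1, 64, dim)  # tokens of the image being generated
ref_out, tgt_out = joint_attention([ref_tokens, tgt_tokens])
print(tgt_out.shape)  # torch.Size([1, 64, 256])
```

The design choice this illustrates is that no task-specific module is needed: cross-image interaction falls out of ordinary self-attention once the token sequences are concatenated.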
Authors:
Huang, Lianghua, Wang, Wei, Wu, Zhi-Fan, Dou, Huanzhang, Shi, Yupeng, Feng, Yutong, Liang, Chen, Liu, Yu, Zhou, Jingren
While large language models (LLMs) have revolutionized natural language processing with their task-agnostic capabilities, visual generation tasks such as image translation, style transfer, and character customization still rely heavily on supervised, …
External link:
http://arxiv.org/abs/2410.15027
Authors:
Wei, Yujie, Zhang, Shiwei, Yuan, Hangjie, Wang, Xiang, Qiu, Haonan, Zhao, Rui, Feng, Yutong, Liu, Feng, Huang, Zhizhong, Ye, Jiaxin, Zhang, Yingya, Shan, Hongming
Recent advances in customized video generation have enabled users to create videos tailored to both specific subjects and motion trajectories. However, existing methods often require complicated test-time fine-tuning and struggle with balancing subject …
External link:
http://arxiv.org/abs/2410.13830
Authors:
Chen, Xi, Feng, Yutong, Chen, Mengting, Wang, Yiyang, Zhang, Shilong, Liu, Yu, Shen, Yujun, Zhao, Hengshuang
Image editing is a practical yet challenging task given the diverse demands from users, where one of the hardest parts is to precisely describe what the edited image should look like. In this work, we present a new form of editing, termed …
External link:
http://arxiv.org/abs/2406.07547
Authors:
Zhang, Shilong, Huang, Lianghua, Chen, Xi, Zhang, Yifei, Wu, Zhi-Fan, Feng, Yutong, Wang, Wei, Shen, Yujun, Liu, Yu, Luo, Ping
This work presents FlashFace, a practical tool with which users can easily personalize their own photos on the fly by providing one or a few reference face images and a text prompt. Our approach is distinguishable from existing human photo customization …
External link:
http://arxiv.org/abs/2403.17008
The air quality inference problem aims to use historical data from a limited number of observation sites to infer the air quality index at an unknown location. Considering the sparsity of data due to the high maintenance cost of the stations, … (A simple baseline sketch of this interpolation setup follows the link below.)
External link:
http://arxiv.org/abs/2403.02354
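The abstract above defines the air quality inference setup: estimate the air quality index (AQI) at an unmonitored location from a few observation sites. As a hedged illustration of that setup only, here is a classical inverse-distance-weighting baseline; it is not the paper's method, and the coordinates and readings are made-up values.

```python
# Illustrative baseline: inverse-distance weighting of sparse AQI readings.
import numpy as np

def idw_aqi(query_xy, site_xy, site_aqi, power=2.0, eps=1e-6):
    """Interpolate AQI at query_xy from sparse monitoring sites."""
    d = np.linalg.norm(site_xy - query_xy, axis=1) + eps  # distances to sites
    w = 1.0 / d ** power                                  # closer sites weigh more
    return float(np.sum(w * site_aqi) / np.sum(w))

site_xy = np.array([[0.0, 0.0], [5.0, 1.0], [2.0, 4.0]])  # known station locations
site_aqi = np.array([42.0, 95.0, 60.0])                    # their observed AQI
print(idw_aqi(np.array([1.0, 1.0]), site_xy, site_aqi))    # estimate at a new point
```

The sparsity the abstract mentions is exactly what makes such distance-only baselines weak, which motivates learned inference models.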
Despite recent progress in text-to-video generation, existing studies usually overlook the fact that only the spatial contents, not the temporal motions, of synthesized videos are controlled by the text. To address this challenge, this work presents …
External link:
http://arxiv.org/abs/2312.02928
Existing text-to-image (T2I) diffusion models usually struggle to interpret complex prompts, especially those involving quantity, object-attribute binding, and multi-subject descriptions. In this work, we introduce a semantic panel as the middleware … (A minimal sketch of such a structured panel follows the link below.)
External link:
http://arxiv.org/abs/2311.17002
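The abstract above introduces a "semantic panel" as middleware between the prompt and the generator. As a hedged sketch of what such a structured intermediate might look like, the snippet below represents the prompt as a list of object entries with captions and layout boxes; the dataclass, field names, and hard-coded parser output are assumptions for illustration, not the paper's schema.

```python
# Illustrative sketch: a structured "panel" of objects with attributes and
# layout boxes, built from a prompt and then handed to a T2I generator.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PanelEntry:
    caption: str                              # e.g. "a red apple"
    bbox: Tuple[float, float, float, float]   # normalized (x0, y0, x1, y1)

def build_panel(prompt: str) -> List[PanelEntry]:
    # In practice an LLM or parser would extract objects, attributes, counts,
    # and layout from the prompt; hard-coded here purely for illustration.
    return [
        PanelEntry("a red apple", (0.10, 0.55, 0.40, 0.90)),
        PanelEntry("a green apple", (0.55, 0.55, 0.85, 0.90)),
    ]

panel = build_panel("two apples on a table, one red and one green")
for entry in panel:
    print(entry.caption, entry.bbox)  # structured layout passed to the generator
```

Splitting generation into prompt-to-panel and panel-to-image stages is what lets the middleware carry quantities and attribute bindings explicitly instead of leaving them implicit in the raw text.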