Showing 1 - 10 of 343 for search: '"Vajda, Peter"'
Author:
Zhang, Christina, Motwani, Simran, Yu, Matthew, Hou, Ji, Juefei-Xu, Felix, Tsai, Sam, Vajda, Peter, He, Zijian, Wang, Jialiang
Latent diffusion models (LDMs) have made significant advancements in the field of image generation in recent years. One major advantage of LDMs is their ability to operate in a compressed latent space, allowing for more efficient training and deployment…
External link:
http://arxiv.org/abs/2409.17565
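The abstract above highlights that LDMs denoise in a compressed latent space rather than in pixel space. The toy NumPy sketch below illustrates only that compression idea: the linear pooling "autoencoder" and the 8x downsampling factor are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Toy sketch of the latent-diffusion idea: denoise a small latent
# instead of the full image. The average-pool "encoder" and
# nearest-neighbour "decoder" stand in for a learned VAE.

rng = np.random.default_rng(0)

def encode(image, factor=8):
    """Compress an HxW image to an (H/f)x(W/f) latent by average pooling."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def decode(latent, factor=8):
    """Upsample the latent back to pixel space by nearest-neighbour repeat."""
    return latent.repeat(factor, axis=0).repeat(factor, axis=1)

image = rng.random((64, 64))
latent = encode(image)                                     # 8x8 latent
noisy = latent + 0.1 * rng.standard_normal(latent.shape)   # one forward-diffusion step
recon = decode(latent)

# The denoiser would operate on 64x fewer elements than pixel space.
print(latent.shape, image.size // latent.size)
```

The efficiency claim in the abstract follows directly from this shape arithmetic: every denoising step touches `(H/f) * (W/f)` values instead of `H * W`.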
Author:
He, Zecheng, Sun, Bo, Juefei-Xu, Felix, Ma, Haoyu, Ramchandani, Ankit, Cheung, Vincent, Shah, Siddharth, Kalia, Anmol, Subramanyam, Harihar, Zareian, Alireza, Chen, Li, Jain, Ankit, Zhang, Ning, Zhang, Peizhao, Sumbaly, Roshan, Vajda, Peter, Sinha, Animesh
Diffusion models have demonstrated remarkable efficacy across various image-to-image tasks. In this research, we introduce Imagine yourself, a state-of-the-art model designed for personalized image generation. Unlike conventional tuning-based personalization…
External link:
http://arxiv.org/abs/2409.13346
Author:
Kohler, Jonas, Pumarola, Albert, Schönfeld, Edgar, Sanakoyeu, Artsiom, Sumbaly, Roshan, Vajda, Peter, Thabet, Ali
Diffusion models are a powerful generative framework, but come with expensive inference. Existing acceleration methods often compromise image quality or fail under complex conditioning when operating in an extremely low-step regime. In this work, we…
External link:
http://arxiv.org/abs/2405.05224
Author:
Yan, David, Zhang, Winnie, Zhang, Luxin, Kalia, Anmol, Wang, Dingkang, Ramchandani, Ankit, Liu, Miao, Pumarola, Albert, Schoenfeld, Edgar, Blanchard, Elliot, Narni, Krishna, Luo, Yaqiao, Chen, Lawrence, Pang, Guan, Thabet, Ali, Vajda, Peter, Bearman, Amy, Yu, Licheng
We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of temporal layer…
External link:
http://arxiv.org/abs/2402.06088
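The abstract above mentions extending a text-to-image backbone with temporal layers. A minimal NumPy sketch of the standard temporal self-attention pattern is shown below: attend across frames independently at each spatial location. All tensor shapes and the single-head formulation are illustrative assumptions, not the model's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(x):
    """Attend over the time axis independently at each spatial token.
    x: (time, space, channels) features from an image backbone."""
    t, s, c = x.shape
    xs = x.transpose(1, 0, 2)                          # (space, time, channels)
    scores = xs @ xs.transpose(0, 2, 1) / np.sqrt(c)   # (space, time, time)
    out = softmax(scores) @ xs                         # mix features across frames
    return out.transpose(1, 0, 2)                      # back to (time, space, channels)

rng = np.random.default_rng(2)
video_feats = rng.standard_normal((8, 64, 32))   # 8 frames, 64 tokens, 32 channels
out = temporal_self_attention(video_feats)
print(out.shape)
```

Because the time axis is attended separately per token, such a layer can be inserted after each spatial block of a pretrained image model without changing its spatial shapes.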
Author:
Liang, Feng, Wu, Bichen, Wang, Jialiang, Yu, Licheng, Li, Kunpeng, Zhao, Yinan, Misra, Ishan, Huang, Jia-Bin, Zhang, Peizhao, Vajda, Peter, Marculescu, Diana
Diffusion models have transformed image-to-image (I2I) synthesis and are now permeating into videos. However, the advancement of video-to-video (V2V) synthesis has been hampered by the challenge of maintaining temporal consistency across video frames…
External link:
http://arxiv.org/abs/2312.17681
Author:
Wu, Bichen, Chuang, Ching-Yao, Wang, Xiaoyan, Jia, Yichen, Krishnakumar, Kapil, Xiao, Tong, Liang, Feng, Yu, Licheng, Vajda, Peter
In this paper, we introduce Fairy, a minimalist yet robust adaptation of image-editing diffusion models, enhancing them for video editing applications. Our approach centers on the concept of anchor-based cross-frame attention, a mechanism that implicitly…
External link:
http://arxiv.org/abs/2312.13834
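The abstract above names anchor-based cross-frame attention as the core mechanism. The NumPy sketch below shows the general shape of that idea, every frame's queries attending to one anchor frame's keys and values so all frames draw appearance from a shared reference. The single-head, unbatched formulation and all sizes are illustrative assumptions, not Fairy's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def anchor_cross_frame_attention(q_frames, k_anchor, v_anchor):
    """Each frame's queries attend to the anchor frame's keys/values.
    q_frames: (frames, tokens, dim); k_anchor, v_anchor: (tokens, dim)."""
    d = q_frames.shape[-1]
    scores = q_frames @ k_anchor.T / np.sqrt(d)   # (frames, tokens, anchor_tokens)
    return softmax(scores) @ v_anchor             # (frames, tokens, dim)

rng = np.random.default_rng(1)
frames, tokens, dim = 4, 16, 32
q = rng.standard_normal((frames, tokens, dim))
anchor_k = rng.standard_normal((tokens, dim))
anchor_v = rng.standard_normal((tokens, dim))
out = anchor_cross_frame_attention(q, anchor_k, anchor_v)
print(out.shape)
```

Tying every frame to the same anchor keys/values is what gives this family of methods its temporal consistency: frames cannot drift toward different appearances, because they all read from one reference.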
Neural Radiance Field (NeRF) has emerged as a leading technique for novel view synthesis, owing to its impressive photorealistic reconstruction and rendering capability. Nevertheless, achieving real-time NeRF rendering in large-scale scenes has presented…
External link:
http://arxiv.org/abs/2312.11841
Author:
Schult, Jonas, Tsai, Sam, Höllein, Lukas, Wu, Bichen, Wang, Jialiang, Ma, Chih-Yao, Li, Kunpeng, Wang, Xiaofang, Wimbauer, Felix, He, Zijian, Zhang, Peizhao, Leibe, Bastian, Vajda, Peter, Hou, Ji
Manually creating 3D environments for AR/VR applications is a complex process requiring expert knowledge in 3D modeling software. Pioneering works facilitate this process by generating room meshes conditioned on textual style descriptions. Yet, many…
External link:
http://arxiv.org/abs/2312.05208
Author:
Zhang, Zhixing, Wu, Bichen, Wang, Xiaoyan, Luo, Yaqiao, Zhang, Luxin, Zhao, Yinan, Vajda, Peter, Metaxas, Dimitris, Yu, Licheng
Recent advances in diffusion models have successfully enabled text-guided image inpainting. While it seems straightforward to extend such editing capability into the video domain, there have been fewer works regarding text-guided video inpainting. Given…
External link:
http://arxiv.org/abs/2312.03816
Author:
Wimbauer, Felix, Wu, Bichen, Schoenfeld, Edgar, Dai, Xiaoliang, Hou, Ji, He, Zijian, Sanakoyeu, Artsiom, Zhang, Peizhao, Tsai, Sam, Kohler, Jonas, Rupprecht, Christian, Cremers, Daniel, Vajda, Peter, Wang, Jialiang
Diffusion models have recently revolutionized the field of image synthesis due to their ability to generate photorealistic images. However, one of the major drawbacks of diffusion models is that the image generation process is costly. A large image-t…
External link:
http://arxiv.org/abs/2312.03209