Search results

Report

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Authors: Wang, Hongjie; Ma, Chih-Yao; Liu, Yen-Cheng; Hou, Ji; Xu, Tao; Wang, Jialiang; Juefei-Xu, Felix; Luo, Yaqiao; Zhang, Peizhao; Hou, Tingbo; Vajda, Peter; Jha, Niraj K.; Dai, Xiaoliang

Text-to-video generation enhances content creation but is highly computationally intensive: The computational cost of Diffusion Transformers (DiTs) scales quadratically in the number of pixels. This makes minute-length video generation extremely expe

External link: http://arxiv.org/abs/2412.09856

Report

Movie Gen: A Cast of Media Foundation Models

Authors: Polyak, Adam; Zohar, Amit; Brown, Andrew; Tjandra, Andros; Sinha, Animesh; Lee, Ann; Vyas, Apoorv; Shi, Bowen; Ma, Chih-Yao; Chuang, Ching-Yao; Yan, David; Choudhary, Dhruv; Wang, Dingkang; Sethi, Geet; Pang, Guan; Ma, Haoyu; Misra, Ishan; Hou, Ji; Wang, Jialiang; Jagadeesh, Kiran; Li, Kunpeng; Zhang, Luxin; Singh, Mannat; Williamson, Mary; Le, Matt; Yu, Matthew; Singh, Mitesh Kumar; Zhang, Peizhao; Vajda, Peter; Duval, Quentin; Girdhar, Rohit; Sumbaly, Roshan; Rambhatla, Sai Saketh; Tsai, Sam; Azadi, Samaneh; Datta, Samyak; Chen, Sanyuan; Bell, Sean; Ramaswamy, Sharadh; Sheynin, Shelly; Bhattacharya, Siddharth; Motwani, Simran; Xu, Tao; Li, Tianhe; Hou, Tingbo; Hsu, Wei-Ning; Yin, Xi; Dai, Xiaoliang; Taigman, Yaniv; Luo, Yaqiao; Liu, Yen-Cheng; Wu, Yi-Chiao; Zhao, Yue; Kirstain, Yuval; He, Zecheng; He, Zijian; Pumarola, Albert; Thabet, Ali; Sanakoyeu, Artsiom; Mallya, Arun; Guo, Baishan; Araya, Boris; Kerr, Breena; Wood, Carleigh; Liu, Ce; Peng, Cen; Vengertsev, Dimitry; Schonfeld, Edgar; Blanchard, Elliot; Juefei-Xu, Felix; Nord, Fraylie; Liang, Jeff; Hoffman, John; Kohler, Jonas; Fire, Kaolin; Sivakumar, Karthik; Chen, Lawrence; Yu, Licheng; Gao, Luya; Georgopoulos, Markos; Moritz, Rashel; Sampson, Sara K.; Li, Shikai; Parmeggiani, Simone; Fine, Steve; Fowler, Tara; Petrovic, Vladan; Du, Yuming

We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio. We also show additional capabilities such as precise instruction-based video editing and generation of

External link: http://arxiv.org/abs/2410.13720

Report

Pixel-Space Post-Training of Latent Diffusion Models

Authors: Zhang, Christina; Motwani, Simran; Yu, Matthew; Hou, Ji; Juefei-Xu, Felix; Tsai, Sam; Vajda, Peter; He, Zijian; Wang, Jialiang

Latent diffusion models (LDMs) have made significant advancements in the field of image generation in recent years. One major advantage of LDMs is their ability to operate in a compressed latent space, allowing for more efficient training and deploym

External link: http://arxiv.org/abs/2409.17565

Report

Imagine yourself: Tuning-Free Personalized Image Generation

Authors: He, Zecheng; Sun, Bo; Juefei-Xu, Felix; Ma, Haoyu; Ramchandani, Ankit; Cheung, Vincent; Shah, Siddharth; Kalia, Anmol; Subramanyam, Harihar; Zareian, Alireza; Chen, Li; Jain, Ankit; Zhang, Ning; Zhang, Peizhao; Sumbaly, Roshan; Vajda, Peter; Sinha, Animesh

Diffusion models have demonstrated remarkable efficacy across various image-to-image tasks. In this research, we introduce Imagine yourself, a state-of-the-art model designed for personalized image generation. Unlike conventional tuning-based persona

External link: http://arxiv.org/abs/2409.13346

Report

An Update on the Hypothetical X17 Particle

Authors: Krasznahorky, A. J.; Krasznahorkay, A.; Csatlós, M.; Timár, J.; Begala, M.; Krakó, A.; Rajta, I.; Vajda, I.; Sas, N. J.

Recently, when examining the differential internal pair creation coefficients of $^8$Be, $^4$He and $^{12}$C nuclei, we observed peak-like anomalies in the angular correlation of the e$^+$e$^-$ pairs. This was interpreted as the creation and immediat

External link: http://arxiv.org/abs/2409.16300

Report

Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation

Authors: Kohler, Jonas; Pumarola, Albert; Schönfeld, Edgar; Sanakoyeu, Artsiom; Sumbaly, Roshan; Vajda, Peter; Thabet, Ali

Diffusion models are a powerful generative framework, but come with expensive inference. Existing acceleration methods often compromise image quality or fail under complex conditioning when operating in an extremely low-step regime. In this work, we

External link: http://arxiv.org/abs/2405.05224

Report

Animated Stickers: Bringing Stickers to Life with Video Diffusion

Authors: Yan, David; Zhang, Winnie; Zhang, Luxin; Kalia, Anmol; Wang, Dingkang; Ramchandani, Ankit; Liu, Miao; Pumarola, Albert; Schoenfeld, Edgar; Blanchard, Elliot; Narni, Krishna; Luo, Yaqiao; Chen, Lawrence; Pang, Guan; Thabet, Ali; Vajda, Peter; Bearman, Amy; Yu, Licheng

We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of temporal layer

External link: http://arxiv.org/abs/2402.06088

Report

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Authors: Liang, Feng; Wu, Bichen; Wang, Jialiang; Yu, Licheng; Li, Kunpeng; Zhao, Yinan; Misra, Ishan; Huang, Jia-Bin; Zhang, Peizhao; Vajda, Peter; Marculescu, Diana

Diffusion models have transformed the image-to-image (I2I) synthesis and are now permeating into videos. However, the advancement of video-to-video (V2V) synthesis has been hampered by the challenge of maintaining temporal consistency across video fr

External link: http://arxiv.org/abs/2312.17681

Report

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Authors: Wu, Bichen; Chuang, Ching-Yao; Wang, Xiaoyan; Jia, Yichen; Krishnakumar, Kapil; Xiao, Tong; Liang, Feng; Yu, Licheng; Vajda, Peter

In this paper, we introduce Fairy, a minimalist yet robust adaptation of image-editing diffusion models, enhancing them for video editing applications. Our approach centers on the concept of anchor-based cross-frame attention, a mechanism that implic

External link: http://arxiv.org/abs/2312.13834

Report

MixRT: Mixed Neural Representations For Real-Time NeRF Rendering

Authors: Li, Chaojian; Wu, Bichen; Vajda, Peter; Yingyan, Lin

Neural Radiance Field (NeRF) has emerged as a leading technique for novel view synthesis, owing to its impressive photorealistic reconstruction and rendering capability. Nevertheless, achieving real-time NeRF rendering in large-scale scenes has prese

External link: http://arxiv.org/abs/2312.11841
