Showing 1 - 10 of 435 results for search: '"A. P. Skorokhodov"'
Author:
Bahmani, Sherwin, Skorokhodov, Ivan, Siarohin, Aliaksandr, Menapace, Willi, Qian, Guocheng, Vasilkovsky, Michael, Lee, Hsin-Ying, Wang, Chaoyang, Zou, Jiaxu, Tagliasacchi, Andrea, Lindell, David B., Tulyakov, Sergey
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of complex videos from a text description. However, most existing models lack fine-grained control over camera movement, which is critical for downstream applications…
External link:
http://arxiv.org/abs/2407.12781
Author:
Fang, Yuwei, Menapace, Willi, Siarohin, Aliaksandr, Chen, Tsai-Shien, Wang, Kuan-Chien, Skorokhodov, Ivan, Neubig, Graham, Tulyakov, Sergey
Existing text-to-video diffusion models rely solely on text-only encoders for their pretraining. This limitation stems from the absence of large-scale multimodal prompt video datasets, resulting in a lack of visual grounding and restricting their versatility…
External link:
http://arxiv.org/abs/2407.06304
Author:
Gu, Jing, Fang, Yuwei, Skorokhodov, Ivan, Wonka, Peter, Du, Xinya, Tulyakov, Sergey, Wang, Xin Eric
Video editing is a cornerstone of digital media, from entertainment and education to professional communication. However, previous methods often overlook the necessity of comprehensively understanding both global and local contexts, leading to inaccurate…
External link:
http://arxiv.org/abs/2406.12831
Author:
Zhang, Zhixing, Li, Yanyu, Wu, Yushu, Xu, Yanwu, Kag, Anil, Skorokhodov, Ivan, Menapace, Willi, Siarohin, Aliaksandr, Cao, Junli, Metaxas, Dimitris, Tulyakov, Sergey, Ren, Jian
Diffusion-based video generation models have demonstrated remarkable success in obtaining high-fidelity videos through the iterative denoising process. However, these models require multiple denoising steps during sampling, resulting in high computational costs…
External link:
http://arxiv.org/abs/2406.04324
Author:
V. A. Kutashov, O. V. Ulyanova, I. S. Protasov, A. P. Skorokhodov, O. V. Zolotaryov, L. S. Nemykh, E. S. Ananyeva, A. A. Dudina, M. V. Uvarova
Published in:
International Journal of Biomedicine, Vol 10, Iss 2, Pp 108-111 (2020)
Background: The objective of this study was to investigate the medico-social and mixed anxiety–depressive disorders (MADD) in patients with cerebellar stroke (CS) in the early recovery period. Methods and Results: The study included 140 patients…
External link:
https://doaj.org/article/a9749acc99774b7587df4fdd5ccdfbab
Author:
Bahmani, Sherwin, Liu, Xian, Yifan, Wang, Skorokhodov, Ivan, Rong, Victor, Liu, Ziwei, Liu, Xihui, Park, Jeong Joon, Tulyakov, Sergey, Wetzstein, Gordon, Tagliasacchi, Andrea, Lindell, David B.
Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using supervision from pre-trained text-to-video models. However, existing representations for motion, such as deformation models or time-dependent neural representations, are limited…
External link:
http://arxiv.org/abs/2403.17920
Author:
Menapace, Willi, Siarohin, Aliaksandr, Skorokhodov, Ivan, Deyneka, Ekaterina, Chen, Tsai-Shien, Kag, Anil, Fang, Yuwei, Stoliar, Aleksei, Ricci, Elisa, Ren, Jian, Tulyakov, Sergey
Contemporary models for generating images show remarkable quality and versatility. Swayed by these advantages, the research community repurposes them to generate videos. Since video content is highly redundant, we argue that naively bringing advances…
External link:
http://arxiv.org/abs/2402.14797
Author:
Qian, Guocheng, Cao, Junli, Siarohin, Aliaksandr, Kant, Yash, Wang, Chaoyang, Vasilkovsky, Michael, Lee, Hsin-Ying, Fang, Yuwei, Skorokhodov, Ivan, Zhuang, Peiye, Gilitschenski, Igor, Ren, Jian, Ghanem, Bernard, Aberman, Kfir, Tulyakov, Sergey
We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously. In contrast to existing text-to-3D methods that often entail time-consuming per-prompt optimization and commonly…
External link:
http://arxiv.org/abs/2402.00867
Author:
Bahmani, Sherwin, Skorokhodov, Ivan, Rong, Victor, Wetzstein, Gordon, Guibas, Leonidas, Wonka, Peter, Tulyakov, Sergey, Park, Jeong Joon, Tagliasacchi, Andrea, Lindell, David B.
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-video models to generate dynamic 3D scenes. However, current text-to-4D methods face a three-way tradeoff between the quality of scene appearance, 3D structure…
External link:
http://arxiv.org/abs/2311.17984
Author:
Liu, Xian, Ren, Jian, Siarohin, Aliaksandr, Skorokhodov, Ivan, Li, Yanyu, Lin, Dahua, Liu, Xihui, Liu, Ziwei, Tulyakov, Sergey
Despite significant advances in large-scale text-to-image models, achieving hyper-realistic human image generation remains a desirable yet unsolved task. Existing models like Stable Diffusion and DALL-E 2 tend to generate human images with incoherent…
External link:
http://arxiv.org/abs/2310.08579