Výsledky vyhledávání - "Villegas, Ruben"

Report

Autor: Hernandez, Jefferson, Villegas, Ruben, Ordonez, Vicente

We propose to use automatically generated instruction-following data to improve the zero-shot capabilities of a large multimodal model with additional support for generative and image editing tasks. We achieve this by curating a new multimodal instru

Externí odkaz: http://arxiv.org/abs/2406.11262

Zobrazit plný text záznamu

Report

Text Prompting for Multi-Concept Video Customization by Autoregressive Generation

Autor: Kothandaraman, Divya, Sohn, Kihyuk, Villegas, Ruben, Voigtlaender, Paul, Manocha, Dinesh, Babaeizadeh, Mohammad

We present a method for multi-concept customization of pretrained text-to-video (T2V) models. Intuitively, the multi-concept customized video can be derived from the (non-linear) intersection of the video manifolds of the individual concepts, which i

Externí odkaz: http://arxiv.org/abs/2405.13951

Zobrazit plný text záznamu

Report

StoryBench: A Multifaceted Benchmark for Continuous Story Visualization

Autor: Bugliarello, Emanuele, Moraldo, Hernan, Villegas, Ruben, Babaeizadeh, Mohammad, Saffar, Mohammad Taghi, Zhang, Han, Erhan, Dumitru, Ferrari, Vittorio, Kindermans, Pieter-Jan, Voigtlaender, Paul

Generating video stories from text prompts is a complex task. In addition to having high visual quality, videos need to realistically adhere to a sequence of text prompts whilst being consistent throughout the frames. Creating a benchmark for video g

Externí odkaz: http://arxiv.org/abs/2308.11606

Zobrazit plný text záznamu

Report

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders

Autor: Hernandez, Jefferson, Villegas, Ruben, Ordonez, Vicente

We propose ViC-MAE, a model that combines both Masked AutoEncoders (MAE) and contrastive learning. ViC-MAE is trained using a global featured obtained by pooling the local representations learned under an MAE reconstruction loss and leveraging this r

Externí odkaz: http://arxiv.org/abs/2303.12001

Zobrazit plný text záznamu

Report

Phenaki: Variable Length Video Generation From Open Domain Textual Description

Autor: Villegas, Ruben, Babaeizadeh, Mohammad, Kindermans, Pieter-Jan, Moraldo, Hernan, Zhang, Han, Saffar, Mohammad Taghi, Castro, Santiago, Kunze, Julius, Erhan, Dumitru

We present Phenaki, a model capable of realistic video synthesis, given a sequence of textual prompts. Generating videos from text is particularly challenging due to the computational cost, limited quantities of high quality text-video data and varia

Externí odkaz: http://arxiv.org/abs/2210.02399

Zobrazit plný text záznamu

Report

RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects

Autor: Jang, Yunseok, Villegas, Ruben, Yang, Jimei, Ceylan, Duygu, Sun, Xin, Lee, Honglak

There have been remarkable successes in computer vision with deep learning. While such breakthroughs show robust performance, there have still been many challenges in learning in-depth knowledge, like occlusion or predicting physical interactions. Al

Externí odkaz: http://arxiv.org/abs/2205.06975

Zobrazit plný text záznamu

Report

Contact-Aware Retargeting of Skinned Motion

Autor: Villegas, Ruben, Ceylan, Duygu, Hertzmann, Aaron, Yang, Jimei, Saito, Jun

This paper introduces a motion retargeting method that preserves self-contacts and prevents interpenetration. Self-contacts, such as when hands touch each other or the torso or the head, are important attributes of human body language and dynamics, y

Externí odkaz: http://arxiv.org/abs/2109.07431

Zobrazit plný text záznamu

Report

Stochastic Scene-Aware Motion Prediction

Autor: Hassan, Mohamed, Ceylan, Duygu, Villegas, Ruben, Saito, Jun, Yang, Jimei, Zhou, Yi, Black, Michael

A long-standing goal in computer vision is to capture, model, and realistically synthesize human behavior. Specifically, by learning from data, our goal is to enable virtual humans to navigate within cluttered indoor scenes and naturally interact wit

Externí odkaz: http://arxiv.org/abs/2108.08284

Zobrazit plný text záznamu

Report

Single-image Full-body Human Relighting

Autor: Lagunas, Manuel, Sun, Xin, Yang, Jimei, Villegas, Ruben, Zhang, Jianming, Shu, Zhixin, Masia, Belen, Gutierrez, Diego

Publikováno v: Eurographics Symposium on Rendering (EGSR), 2021

We present a single-image data-driven method to automatically relight images with full-body humans in them. Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting

Externí odkaz: http://arxiv.org/abs/2107.07259

Zobrazit plný text záznamu

Report

Task-Generic Hierarchical Human Motion Prior using VAEs

Autor: Li, Jiaman, Villegas, Ruben, Ceylan, Duygu, Yang, Jimei, Kuang, Zhengfei, Li, Hao, Zhao, Yajie

A deep generative model that describes human motions can benefit a wide range of fundamental computer vision and graphics tasks, such as providing robustness to video-based human pose estimation, predicting complete body movements for motion capture

Externí odkaz: http://arxiv.org/abs/2106.04004

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání