Search results - "Owens, Andrew P"

Report

Motion Prompting: Controlling Video Generation with Motion Trajectories

Authors: Geng, Daniel; Herrmann, Charles; Hur, Junhwa; Cole, Forrester; Zhang, Serena; Pfaff, Tobias; Lopez-Guevara, Tatiana; Doersch, Carl; Aytar, Yusuf; Rubinstein, Michael; Sun, Chen; Wang, Oliver; Owens, Andrew; Sun, Deqing

Motion control is crucial for generating expressive and compelling video content; however, most existing video generation models rely mainly on text prompts for control, which struggle to capture the nuances of dynamic actions and temporal compositio

External link: http://arxiv.org/abs/2412.02700

Report

Video-Guided Foley Sound Generation with Multimodal Controls

Authors: Chen, Ziyang; Seetharaman, Prem; Russell, Bryan; Nieto, Oriol; Bourgin, David; Owens, Andrew; Salamon, Justin

Generating sound effects for videos often requires creating artistic sound effects that diverge significantly from real-life sources and flexible control in the sound design. To address this problem, we introduce MultiFoley, a model designed for vide

External link: http://arxiv.org/abs/2411.17698

Report

Community Forensics: Using Thousands of Generators to Train Fake Image Detectors

Authors: Park, Jeongsoo; Owens, Andrew

One of the key challenges of detecting AI-generated images is spotting images that have been created by previously unseen generative models. We argue that the limited diversity of the training data is a major obstacle to addressing this problem, and

External link: http://arxiv.org/abs/2411.04125

Report

Contrastive Touch-to-Touch Pretraining

Authors: Rodriguez, Samanta; Dou, Yiming; Bogert, William van den; Oller, Miquel; So, Kevin; Owens, Andrew; Fazeli, Nima

Today's tactile sensors have a variety of different designs, making it challenging to develop general-purpose methods for processing touch signals. In this paper, we learn a unified representation that captures the shared information between differen

External link: http://arxiv.org/abs/2410.11834

Report

Self-Supervised Any-Point Tracking by Contrastive Random Walks

Authors: Shrivastava, Ayush; Owens, Andrew

We present a simple, self-supervised approach to the Tracking Any Point (TAP) problem. We train a global matching transformer to find cycle consistent tracks through video via contrastive random walks, using the transformer's attention-based global m

External link: http://arxiv.org/abs/2409.16288

Report

Self-Supervised Audio-Visual Soundscape Stylization

Authors: Li, Tingle; Wang, Renhao; Huang, Po-Yao; Owens, Andrew; Anumanchipalli, Gopala

Speech sounds convey a great deal of information about the scenes, resulting in a variety of effects ranging from reverberation to additional ambient sounds. In this paper, we manipulate input speech to sound as though it was recorded within a differ

External link: http://arxiv.org/abs/2409.14340

Report

Tactile Functasets: Neural Implicit Representations of Tactile Datasets

Authors: Li, Sikai; Rodriguez, Samanta; Dou, Yiming; Owens, Andrew; Fazeli, Nima

Modern incarnations of tactile sensors produce high-dimensional raw sensory feedback such as images, making it challenging to efficiently store, process, and generalize across sensors. To address these concerns, we introduce a novel implicit function

External link: http://arxiv.org/abs/2409.14592

Report

Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation

Authors: Rodriguez, Samanta; Dou, Yiming; Oller, Miquel; Owens, Andrew; Fazeli, Nima

Today's touch sensors come in many shapes and sizes. This has made it challenging to develop general-purpose touch processing methods since models are generally tied to one specific sensor design. We address this problem by performing cross-modal pre

External link: http://arxiv.org/abs/2409.08269

Report

Images that Sound: Composing Images and Sounds on a Single Canvas

Authors: Chen, Ziyang; Geng, Daniel; Owens, Andrew

Spectrograms are 2D representations of sound that look very different from the images found in our visual world. And natural images, when played as spectrograms, make unnatural sounds. In this paper, we show that it is possible to synthesize spectrog

External link: http://arxiv.org/abs/2405.12221

Report

Efficient Vision-Language Pre-training by Cluster Masking

Authors: Wei, Zihao; Pan, Zixuan; Owens, Andrew

We propose a simple strategy for masking image patches during visual-language contrastive learning that improves the quality of the learned representations and the training speed. During each iteration of training, we randomly mask clusters of visual

External link: http://arxiv.org/abs/2405.08815
