Search results - "Zhang, Chenxu"

Report

DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures

Authors: Hogue, Steven, Zhang, Chenxu, Daruger, Hamza, Tian, Yapeng, Guo, Xiaohu

Audio-driven talking video generation has advanced significantly, but existing methods often depend on video-to-video translation techniques and traditional generative networks such as GANs, and they typically generate talking heads and co-speech gestures

External link: http://arxiv.org/abs/2409.07649

Report

Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter

Authors: Song, Suqi, Zhang, Chenxu, Zhang, Peng, Li, Pengkun, Song, Fenglong, Zhang, Lei

Urban waterlogging poses a major risk to public safety and infrastructure. Conventional methods using water-level sensors require high maintenance and can hardly achieve full coverage. Recent advances employ surveillance camera imagery and deep learning for

External link: http://arxiv.org/abs/2407.08109

Report

Vision Mamba: A Comprehensive Survey and Taxonomy

Authors: Liu, Xiao, Zhang, Chenxu, Zhang, Lei

A State Space Model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems. SSMs have found numerous applications in several fields, including control theory, signal processing, economics and machine learnin

External link: http://arxiv.org/abs/2405.04404

Report

Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion

Authors: Yang, Fan, Zhang, Jianfeng, Shi, Yichun, Chen, Bowen, Zhang, Chenxu, Zhang, Huichao, Yang, Xiaofeng, Feng, Jiashi, Lin, Guosheng

Benefiting from the rapid development of 2D diffusion models, 3D content creation has made significant progress recently. One promising solution involves the fine-tuning of pre-trained 2D diffusion models to harness their capacity for producing multi

External link: http://arxiv.org/abs/2404.06429

Report

Robust Active Speaker Detection in Noisy Environments

Authors: Vasireddy, Siva Sai Nagender, Zhang, Chenxu, Guo, Xiaohu, Tian, Yapeng

This paper addresses the issue of active speaker detection (ASD) in noisy environments and formulates a robust active speaker detection (rASD) problem. Existing ASD approaches leverage both audio and visual modalities, but non-speech sounds in the su

External link: http://arxiv.org/abs/2403.19002

Report

Sora Generates Videos with Stunning Geometrical Consistency

Authors: Li, Xuanyi, Zhou, Daquan, Zhang, Chenxu, Wei, Shaodong, Hou, Qibin, Cheng, Ming-Ming

The recently developed Sora model [1] has exhibited remarkable capabilities in video generation, sparking intense discussions regarding its ability to simulate real-world phenomena. Despite its growing popularity, there is a lack of established metri

External link: http://arxiv.org/abs/2402.17403

Report

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Authors: Zhang, Chenxu, Wang, Chao, Zhang, Jianfeng, Xu, Hongyi, Song, Guoxian, Xie, You, Luo, Linjie, Tian, Yapeng, Guo, Xiaohu, Feng, Jiashi

The generation of emotional talking faces from a single portrait image remains a significant challenge. The simultaneous achievement of expressive emotional talking and accurate lip-sync is particularly difficult, as expressiveness is often compromis

External link: http://arxiv.org/abs/2312.13578
