Search results

Report

VidTok: A Versatile and Open-Source Video Tokenizer

Authors: Tang, Anni; He, Tianyu; Guo, Junliang; Cheng, Xinle; Song, Li; Bian, Jiang

Encoding video content into compact latent tokens has become a fundamental step in video generation and understanding, driven by the need to address the inherent redundancy in pixel-level representations. Consequently, there is a growing demand for h

External link: http://arxiv.org/abs/2412.13061

Report

Controllable Distortion-Perception Tradeoff Through Latent Diffusion for Neural Image Compression

Authors: Zhou, Chuqin; Lu, Guo; Li, Jiangchuan; Chen, Xiangyu; Cheng, Zhengxue; Song, Li; Zhang, Wenjun

Neural image compression often faces a challenging trade-off among rate, distortion and perception. While most existing methods typically focus on either achieving high pixel-level fidelity or optimizing for perceptual metrics, we propose a novel app

External link: http://arxiv.org/abs/2412.11379

Report

VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression

Authors: Hu, Qiang; Zhong, Houqiang; Zheng, Zihan; Zhang, Xiaoyun; Cheng, Zhengxue; Song, Li; Zhai, Guangtao; Wang, Yanfeng

Neural Radiance Field (NeRF)-based volumetric video has revolutionized visual media by delivering photorealistic Free-Viewpoint Video (FVV) experiences that provide audiences with unprecedented immersion and interactivity. However, the substantial da

External link: http://arxiv.org/abs/2412.11362

Report

Face De-identification: State-of-the-art Methods and Comparative Studies

Authors: Cao, Jingyi; Chen, Xiangyi; Liu, Bo; Ding, Ming; Xie, Rong; Song, Li; Li, Zhu; Zhang, Wenjun

The widespread use of image acquisition technologies, along with advances in facial recognition, has raised serious privacy concerns. Face de-identification usually refers to the process of concealing or replacing personal identifiers, which is regar

External link: http://arxiv.org/abs/2411.09863

Report

Rate-aware Compression for NeRF-based Volumetric Video

Authors: Zhang, Zhiyu; Lu, Guo; Liang, Huanxiong; Cheng, Zhengxue; Tang, Anni; Song, Li

The neural radiance fields (NeRF) have advanced the development of 3D volumetric video technology, but the large data volumes they involve pose significant challenges for storage and transmission. To address these problems, the existing solutions typ

External link: http://arxiv.org/abs/2411.05322

Report

Content-Adaptive Rate-Quality Curve Prediction Model in Media Processing System

Authors: Yin, Shibo; Zhang, Zhiyu; Ning, Peirong; Chen, Qiubo; Chen, Jing; Zhou, Quan; Song, Li

In streaming media services, video transcoding is a common practice to alleviate bandwidth demands. Unfortunately, traditional methods employing a uniform rate factor (RF) across all videos often result in significant inefficiencies. Content-adaptive

External link: http://arxiv.org/abs/2411.05295

Report

Modulating Anisotropic Magnetism of Layered CuCrP$_{2}$S$_{6}$ Single Crystal via Selenium Substitution

Authors: Eid, I. S.; Song, Li

As one of typical layered structures, antiferromagnetic CuCrP$_{2}$S$_{6}$ single crystal has high potential for magnetoelectric devices and spintronic multi-terminal chips due to its unique magnetic anisotropy. However, to tune the anisotropic magne

External link: http://arxiv.org/abs/2409.12509

Report

PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation

Authors: Ling, Jun; Wang, Yiwen; Xue, Han; Xie, Rong; Song, Li

While previous audio-driven talking head generation (THG) methods generate head poses from driving audio, the generated poses or lips cannot match the audio well or are not editable. In this study, we propose PoseTalk, a THG system that can

External link: http://arxiv.org/abs/2409.02657

Report

A New People-Object Interaction Dataset and NVS Benchmarks

Authors: Guo, Shuai; Zhong, Houqiang; Wang, Qiuwen; Chen, Ziyu; Gao, Yijie; Yuan, Jiajing; Zhang, Chenyu; Xie, Rong; Song, Li

Recently, NVS in human-object interaction scenes has received increasing attention. Existing human-object interaction datasets mainly consist of static data with limited views, offering only RGB images or videos, mostly containing interactions betwee

External link: http://arxiv.org/abs/2409.12980

Report

OmniRe: Omni Urban Scene Reconstruction

Authors: Chen, Ziyu; Yang, Jiawei; Huang, Jiahui; de Lutio, Riccardo; Esturo, Janick Martinez; Ivanovic, Boris; Litany, Or; Gojcic, Zan; Fidler, Sanja; Pavone, Marco; Song, Li; Wang, Yue

We introduce OmniRe, a holistic approach for efficiently reconstructing high-fidelity dynamic urban scenes from on-device logs. Recent methods for modeling driving sequences using neural radiance fields or Gaussian Splatting have demonstrated the pot

External link: http://arxiv.org/abs/2408.16760
