Showing 1 - 10 of 217 for search: '"Wang, Shanshe"'
Authors:
Shan, Wenkang, Liu, Zhenhua, Zhang, Xinfeng, Wang, Zhao, Han, Kai, Wang, Shanshe, Ma, Siwei, Gao, Wen
In this paper, a novel Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed for probabilistic 3D human pose estimation. On the one hand, D3DP generates multiple possible 3
External link:
http://arxiv.org/abs/2303.11579
Video Coding for Machines (VCM) aims to compress visual signals for machine analysis. However, existing methods only consider a few machines, neglecting the majority. Moreover, the machine's perceptual characteristics are not leveraged effectively, r
External link:
http://arxiv.org/abs/2211.06797
Although recurrent neural network (RNN) based video prediction methods have made significant progress, their performance on high-resolution datasets is still far from satisfactory because of the information loss problem and the
External link:
http://arxiv.org/abs/2206.04381
As a highly ill-posed issue, single image super-resolution (SISR) has been widely investigated in recent years. The main task of SISR is to recover the information loss caused by the degradation procedure. According to the Nyquist sampling theory, th
External link:
http://arxiv.org/abs/2206.03361
Image super-resolution (SR) has been widely investigated in recent years. However, it is challenging to fairly estimate the performance of various SR methods, owing to the lack of reliable and accurate criteria for perceptual quality. Existing metrics
External link:
http://arxiv.org/abs/2205.13847
Bit-depth expansion (BDE) is one of the emerging technologies for displaying high bit-depth (HBD) images from low bit-depth (LBD) sources. Existing BDE methods have no unified solution for various BDE situations, and directly learn a mapping for each pixel
External link:
http://arxiv.org/abs/2204.12039
Video prediction aims to predict future frames by modeling the complex spatiotemporal dynamics in videos. However, most of the existing methods only model the temporal information and the spatial information for videos in an independent manner but ha
External link:
http://arxiv.org/abs/2204.09456
Although many video prediction methods have obtained good performance in low-resolution (64$\sim$128) videos, predictive models for high-resolution (512$\sim$4K) videos have not been fully explored yet, which are more meaningful due to the increasing
External link:
http://arxiv.org/abs/2203.16084
This paper introduces a novel Pre-trained Spatial Temporal Many-to-One (P-STMO) model for the 2D-to-3D human pose estimation task. To reduce the difficulty of capturing spatial and temporal information, we divide this task into two stages: pre-training (
External link:
http://arxiv.org/abs/2203.07628
It is challenging to restore low-resolution (LR) images to super-resolution (SR) images with correct and clear details. Existing deep learning works largely neglect the inherent structural information of images, which plays an important role for vis
External link:
http://arxiv.org/abs/2201.01458