Video prediction: a step-by-step improvement of a video synthesis network
Author: | Liyong Bao, Hongwei Ding, Zhijun Yang, Bo Li, Beibei Jing |
Year of publication: | 2021 |
Subject: | Pixel, Artificial neural network, Computer science, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Field (computer science), Convolution, Set (abstract data type), Artificial Intelligence, Computer Science::Computer Vision and Pattern Recognition, Computer Science::Multimedia, Network performance, Computer vision, Artificial intelligence, Generator (mathematics) |
Source: | Applied Intelligence. 52:3640-3652 |
ISSN: | 1573-7497 0924-669X |
DOI: | 10.1007/s10489-021-02500-5 |
Description: | Although the field of video generation has made progress in network performance and computational efficiency, there is still much room for improvement in the number and clarity of predicted frames. In this paper, a deep learning model is proposed to predict future video frames. The model can predict video streams with complex pixel distributions of up to 32 frames. Our framework is composed of two modules: a fusion image prediction generator and an image-video translator. The fusion image prediction generator is realized by a U-Net neural network built with 3D convolutions, and the image-video translator is a conditional generative adversarial network built with 2D convolutions. In the proposed framework, given a set of fusion images and labels, the fusion image prediction generator learns to fit the pixel distribution of the label images from the fusion images. The image-video translator then translates the output of the fusion image prediction generator into future video frames. In addition, this paper proposes an accompanying convolution model and a corresponding algorithm for improving image sharpness. Our experimental results demonstrate the effectiveness of this framework. |
Database: | OpenAIRE |
External link: |
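The abstract describes a two-stage pipeline: a fusion image prediction generator (a 3D-convolution U-Net) maps a set of fusion images to a predicted fused image, and an image-video translator (a 2D-convolution conditional GAN) expands that image into up to 32 future frames. The sketch below illustrates only the data flow and tensor shapes of such a pipeline; the function bodies are trivial placeholders, and all names, shapes, and the frame count are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fusion_image_generator(fusion_images):
    """Stand-in for the 3D-convolution U-Net: maps a set of fusion
    images (T, H, W, C) to one predicted fused image (H, W, C).
    Placeholder logic: average over the temporal axis."""
    return fusion_images.mean(axis=0)

def image_video_translator(fused_image, num_frames=32):
    """Stand-in for the 2D-convolution conditional GAN: expands one
    fused image (H, W, C) into a frame sequence (num_frames, H, W, C).
    Placeholder logic: tile the image along a new temporal axis."""
    return np.repeat(fused_image[np.newaxis], num_frames, axis=0)

# Hypothetical example: 8 input fusion images of 64x64 RGB pixels,
# predicting the 32-frame horizon quoted in the abstract.
inputs = np.random.rand(8, 64, 64, 3)
fused = fusion_image_generator(inputs)
video = image_video_translator(fused, num_frames=32)
print(video.shape)  # (32, 64, 64, 3)
```

In the real model, each placeholder would be a learned network, but the shape contract between the two modules stays the same: one image in, a video tensor out.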