A study on the impact of training data in CNN-based super-resolution for low bitrate end-to-end video coding

Authors: Gildas Cocherel, Nicolas Dhollande, Wassim Hamidouche, Fatemeh Nasiri, Luce Morin
Contributors: Institut de Recherche Technologique b-com (IRT b-com), Institut d'Électronique et des Technologies du numéRique (IETR), Université de Nantes (UN)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Institut National des Sciences Appliquées (INSA), AVIWEST, Nantes Université (NU)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Université de Nantes (UN)-Université de Rennes 1 (UR1)
Language: English
Publication year: 2020
Subject:
Source: Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), Nov 2020, Paris, France. pp.1-5, ⟨10.1109/IPTA50016.2020.9286717⟩
Description: International audience; In this study, the effectiveness of Super Resolution (SR) methods based on Convolutional Neural Networks (CNNs) in low-bitrate video coding, with a focus on the Versatile Video Coding (VVC) standard, is investigated. Video transmission over networks with limited bandwidth is a common challenge across applications. One solution is to adopt SR methods, whose main principle is to spatially downsample the input sequence before encoding and then upsample the decoded sequence before display. For a fixed target bandwidth, a finer quantization is applied to the low-resolution sequence than to the high-resolution one, so that the high-quality reconstructed pixels help in retrieving the lost information. However, most CNN-based SR methods are designed for single images and focus only on the original input signal. Their trained networks therefore lack an understanding of compression artifacts. In this study, we test the hypothesis that training CNN-based SR methods with compressed sequences outperforms training with uncompressed ones. The assumption is that such training allows the SR methods to learn compression artifacts and to differentiate them from actual texture information. To this end, state-of-the-art CNN-based SR methods are tested with compressed and uncompressed training sets. Experiments show that the use of compressed training data brings, on average, an additional bitrate saving of 6% in terms of BD-Rate.
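Note on BD-Rate: the record does not say how the figure was computed. As a rough illustration only, the widely used Bjøntegaard delta-rate calculation can be sketched as follows: fit a cubic through four (PSNR, log bitrate) points per codec, average the gap between the two curves over the shared PSNR range, and convert it back to a percentage. The function names and the rate/PSNR points below are hypothetical and are not taken from the paper.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(anchor, test):
    """Average bitrate difference (percent) of `test` relative to `anchor`.

    Each argument is a list of four (bitrate, psnr) pairs with distinct
    PSNR values. A negative result means `test` needs less bitrate for
    the same quality.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    psnr_a = [q for _, q in anchor]
    psnr_t = [q for _, q in test]
    log_a = [math.log(r) for r, _ in anchor]
    log_t = [math.log(r) for r, _ in test]

    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(psnr_a), min(psnr_t))
    hi = min(max(psnr_a), max(psnr_t))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")

    int_a = simpson(lambda q: lagrange_eval(psnr_a, log_a, q), lo, hi)
    int_t = simpson(lambda q: lagrange_eval(psnr_t, log_t, q), lo, hi)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Hypothetical points: (bitrate in kbps, PSNR in dB). Not data from the paper.
anchor = [(1000.0, 32.0), (1800.0, 34.5), (3200.0, 36.8), (6000.0, 38.9)]
# A test curve that reaches the same PSNR with 6% less bitrate at every point.
test_curve = [(0.94 * rate, psnr) for rate, psnr in anchor]

print(round(bd_rate(anchor, test_curve), 3))
```

With these made-up points the test curve uses 6% less bitrate at every quality level, so the computed BD-Rate is a 6% saving (a negative value). Real evaluations typically use four quantization settings per sequence and average the result over a test set.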
Database: OpenAIRE