Popis: |
Despite the plethora of successful super-resolution (SR) reconstruction (SRR) models applied to natural images, their application to remote sensing imagery tends to produce poor results. Remote sensing imagery is often more complicated than natural images, has its peculiarities such as being of lower resolution, contains noise, and often depicts large textured surfaces. As a result, applying nonspecialized SRR models like the enhanced SR generative adversarial network (ESRGAN) on remote sensing imagery results in artifacts and poor reconstructions. To address these problems, we propose a novel strategy for enabling an SRR model to output realistic remote sensing images: instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized digital surface model (nDSM) of the image. This allows the application of better-informed updates during the training of the model which sources from a task (elevation map inference) that is closely related to remote sensing. Nonetheless, the nDSM auxiliary information is not required during production, i.e., the model infers an SR image without additional data. We assess our model on two remotely sensed datasets of different spatial resolutions that also contain the DSMs of the images: the Data Fusion 2018 Contest (DFC2018) dataset and the dataset containing the national LiDAR flyby of Luxembourg. We compare our model with ESRGAN, and we show that it achieves better performance and does not introduce any artifacts in the results. In particular, the results for the high-resolution DFC2018 dataset are realistic and almost indistinguishable from the ground-truth images. |