DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models

Publication details

The escalating demand for high-resolution Earth Observation (EO) data across a wide range of applications has driven significant advances in image processing techniques. This study proposes a workflow to super-resolve the 12 spectral bands of Sentinel-2 Level-2A imagery to a ground sampling distance of 2.5 m. The method takes a hybrid approach, integrating diffusion models with image fusion techniques. A critical component of the methodology is the super-resolution of the Sentinel-2 RGB bands; the resulting super-resolved RGB image subsequently drives the fusion pipeline that super-resolves the remaining spectral bands. The super-resolution algorithm is based on a diffusion model and is trained on the extensive, freely available National Agriculture Imagery Program (NAIP) dataset of aerial images. To make an algorithm trained on NAIP images applicable to Sentinel-2 imagery, harmonisation and degradation are necessary to compensate for the inherent differences between the two sensors. We therefore employ a degradation and harmonisation model that accurately simulates Sentinel-2 images from NAIP data, ensuring that the harmonised NAIP images closely mimic the characteristics of Sentinel-2 observations after resolution reduction.
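As an illustrative sketch only (not the authors' actual degradation and harmonisation model), the idea of simulating Sentinel-2-like training inputs from NAIP data can be expressed as a blur-and-decimate step followed by a simple moment-matching harmonisation. The function names `degrade_to_sentinel2` and `harmonise_bands` are hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade_to_sentinel2(hr_img, scale=4, blur_sigma=1.0):
    """Simulate a coarser-GSD image from a high-resolution aerial image
    (illustrative sketch: Gaussian blur as a stand-in for the sensor MTF,
    then decimation to the coarser grid).
    hr_img: float array (H, W, C), H and W divisible by `scale`."""
    blurred = np.stack(
        [gaussian_filter(hr_img[..., c], blur_sigma) for c in range(hr_img.shape[-1])],
        axis=-1,
    )
    return blurred[::scale, ::scale, :]

def harmonise_bands(src, ref):
    """Match the per-band mean and standard deviation of `src` to `ref`,
    a minimal moment-matching form of radiometric harmonisation."""
    out = np.empty_like(src)
    for c in range(src.shape[-1]):
        s, r = src[..., c], ref[..., c]
        out[..., c] = (s - s.mean()) / (s.std() + 1e-8) * r.std() + r.mean()
    return out
```

A real pipeline would instead use sensor-specific spectral response and MTF models, but the structure — degrade, then harmonise to the target sensor's statistics — is the same.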

To investigate whether training the diffusion model on a large dataset of airborne images such as NAIP yields better results than training on a smaller satellite-based dataset such as WorldStrat, which contains high-resolution SPOT images, we performed a comparative analysis. The results demonstrate that models trained on the harmonised and correctly simulated NAIP data significantly outperform not only those trained directly on SPOT images but also other existing super-resolution models. This finding shows that training with more data can be beneficial, provided the data is properly harmonised and degraded to match Sentinel-2 imagery. We validated the proposed model with a comprehensive evaluation using the recently established open-SR test methodology. This framework goes beyond traditional super-resolution metrics such as PSNR, SSIM, and LPIPS, instead assessing the model on metrics that measure its consistency, synthesis, and correctness. Under this framework, the proposed super-resolution model outperformed several current state-of-the-art models, and visual comparison further confirmed its superior performance in both urban and rural scenes.
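To make the distinction concrete, a consistency-style metric in the spirit of the open-SR test (this is an illustrative sketch, not the official implementation) checks that the super-resolved output, when degraded back to the input grid, still agrees with the original low-resolution image:

```python
import numpy as np

def psnr(x, y, data_range=1.0):
    """Standard peak signal-to-noise ratio in dB."""
    mse = np.mean((x - y) ** 2)
    return float(10 * np.log10(data_range ** 2 / mse))

def consistency_psnr(sr, lr, scale=4):
    """Consistency check (illustrative): block-average the super-resolved
    image down to the low-resolution grid and compare it with the
    original low-resolution input. High values mean the SR output has
    not hallucinated content that contradicts the observation."""
    H, W, C = sr.shape
    sr_down = sr.reshape(H // scale, scale, W // scale, scale, C).mean(axis=(1, 3))
    return psnr(sr_down, lr)
```

Traditional metrics such as PSNR/SSIM compare the SR output against a high-resolution reference; consistency metrics instead compare it against the low-resolution input, which is why the two families can rank models differently.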

An important contribution of the proposed model is the super-resolution of all 12 Sentinel-2 Level-2A bands, in contrast to previous work, which has mainly focused on the RGB bands. The proposed fusion pipeline uses the super-resolved RGB image to obtain an enhanced 12-band Sentinel-2 image, in a manner similar to pansharpening. Qualitative and quantitative results on all 12 bands demonstrate the seamless performance of the fusion method.
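The analogy to pansharpening can be sketched as a Brovey-like detail injection, where the super-resolved RGB intensity plays the role of the panchromatic band. This is a hypothetical illustration of the principle, not the paper's fusion pipeline; `fuse_band` is an assumed helper name:

```python
import numpy as np

def fuse_band(lowres_band, sr_intensity, scale=4):
    """Pansharpening-style detail injection (illustrative sketch):
    upsample a low-resolution band, then modulate it by the ratio of the
    super-resolved intensity to its own low-pass version, so the fused
    band inherits the fine spatial detail of the SR guide image.
    lowres_band: (h, w); sr_intensity: (h*scale, w*scale)."""
    up = np.kron(lowres_band, np.ones((scale, scale)))  # nearest-neighbour upsample
    H, W = sr_intensity.shape
    lowpass = np.kron(
        sr_intensity.reshape(H // scale, scale, W // scale, scale).mean(axis=(1, 3)),
        np.ones((scale, scale)),
    )
    return up * sr_intensity / (lowpass + 1e-8)
```

When the guide intensity carries no detail (it is constant), the ratio is 1 and the band passes through unchanged; detail is injected only where the SR guide varies within a low-resolution pixel.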

This study not only showcases the potential of combining AI-driven super-resolution models with image fusion techniques in enhancing EO data resolution but also addresses the critical challenges posed by the diversity in data sources and the necessity for accurate generative models in training neural networks for super-resolution tasks.