International Journal of Remote Sensing, 2021
Taylor & Francis
Capturing global contextual representations in remote sensing images by exploiting long-range pixel-pixel dependencies has been shown to improve segmentation performance.
However, how to do this efficiently remains an open question, as current approaches, which utilise attention schemes or very deep models to enlarge the field of view, increase complexity and memory consumption. Inspired by recent work on graph neural networks, we propose the Self-Constructing Graph (SCG) module, which learns a long-range dependency graph directly from the image data and uses it to capture global contextual information efficiently to improve semantic segmentation. The SCG module provides a high degree of flexibility for constructing segmentation networks that seamlessly combine the benefits of variants of graph neural networks (GNN) and convolutional neural networks (CNN). Our SCG-GCN model, a variant of SCG-Net built upon graph convolutional networks (GCN), performs semantic segmentation in an end-to-end manner with competitive performance on the publicly available ISPRS Potsdam and Vaihingen datasets, achieving mean F1-scores of 92.0% and 89.8%, respectively. We conclude that SCG-Net is an attractive architecture for semantic segmentation of remote sensing images, since it achieves competitive performance with far fewer parameters and lower computational cost than related models based on convolutional neural networks.
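To make the idea of a graph learned from image features more concrete, the following is a minimal NumPy sketch and not the SCG module itself. It assumes that CNN features are average-pooled onto a coarse grid of nodes, that the adjacency matrix is obtained as a rectified inner product of projected node embeddings, and that a single GCN-style propagation step produces class scores. The function names, the projection matrix `proj`, the weight matrix `weights` and the grid size are all hypothetical and chosen for illustration only.

```python
import numpy as np


def self_constructed_adjacency(nodes, proj):
    """Build a dense, non-negative adjacency matrix from node features.

    nodes: (n, d) node features; proj: (d, k) hypothetical learned projection.
    """
    z = nodes @ proj                 # (n, k) node embeddings
    return np.maximum(z @ z.T, 0.0)  # rectified pairwise similarity, (n, n)


def normalize_adjacency(adj):
    """Symmetric normalisation D^{-1/2} (A + I) D^{-1/2}, as commonly used in GCNs."""
    adj_hat = adj + np.eye(adj.shape[0])  # add self-loops, so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(adj_hat.sum(axis=1))
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def coarse_segmentation(feature_map, proj, weights, grid=4):
    """Label a grid x grid set of nodes from an (H, W, d) feature map.

    Assumes H and W are divisible by `grid`; weights: (d, n_classes).
    """
    height, width, depth = feature_map.shape
    # Average-pool the feature map so that each grid cell becomes one graph node.
    pooled = feature_map.reshape(
        grid, height // grid, grid, width // grid, depth
    ).mean(axis=(1, 3))
    nodes = pooled.reshape(grid * grid, depth)

    adj = normalize_adjacency(self_constructed_adjacency(nodes, proj))
    logits = adj @ nodes @ weights  # one graph propagation step, (n, n_classes)
    return logits.reshape(grid, grid, -1).argmax(axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(8, 8, 3))  # stand-in for CNN backbone features
    labels = coarse_segmentation(
        features, rng.normal(size=(3, 2)), rng.normal(size=(3, 5))
    )
    print(labels.shape)
```

In this sketch every node can exchange information with every other node in a single propagation step, which illustrates how a learned graph can provide global context without stacking many convolutional layers. In a trained model the projection and weights would be learned end to end, and the coarse node labels would be upsampled back to the image resolution.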