Weakly-supervised Domain Adaptation for Built-up Region Segmentation in Aerial and Satellite Imagery

The figure displays sample images from both the source and target domain datasets. Built-up structures in the source and all three target datasets differ from each other (best viewed in color). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


Abstract: This paper proposes a novel domain adaptation algorithm to handle the challenges posed by satellite and aerial imagery, and demonstrates its effectiveness on the built-up region segmentation problem. Built-up area estimation is an important component in understanding the human impact on the environment, the effect of public policy and, in general, urban population analysis. The diverse nature of aerial and satellite imagery (capturing different geographical locations, terrains and weather conditions) and the lack of labeled data covering this diversity make it difficult for machine learning algorithms to generalize on such tasks, especially across multiple domains. Re-training for a new domain is expensive in both computation and labor, mainly due to the cost of collecting the pixel-level labels required for the segmentation task. Domain adaptation algorithms have been proposed to enable algorithms trained on images of one domain (source) to work on images from another dataset (target). Unsupervised domain adaptation is a popular choice since it allows the trained model to adapt without requiring any ground-truth information from the target domain. However, because aerial and satellite imagery lacks the strong spatial context and structure of ground imagery, applying existing unsupervised domain adaptation methods results in sub-optimal adaptation. We thoroughly study the limitations of existing domain adaptation methods and propose a weakly-supervised adaptation strategy in which we assume image-level labels are available for the target domain. More specifically, we design a built-up area segmentation network (as an encoder-decoder), with an image classification head added to guide the adaptation. The devised system is able to address the problem of visual differences across multiple satellite and aerial imagery datasets, ranging from high resolution (HR) to very high resolution (VHR), by investigating the latent space as well as the structured output space.
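The weak supervision described above relies on image-level labels. This page does not spell out how those labels are defined, so the sketch below assumes one plausible form: a binary tag saying whether an image contains any built-up pixels. For source-domain images, where pixel masks exist, such a tag could be derived from the mask. The function name `image_level_label` and the `min_fraction` threshold are hypothetical, chosen only for illustration.

```python
def image_level_label(mask, min_fraction=0.0):
    """Return 1 if the binary mask contains built-up pixels, else 0.

    mask: nested lists of 0/1 values (1 = built-up pixel).
    min_fraction: hypothetical threshold; the built-up share of the image
    must exceed it for the image to be tagged as built-up.
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    built_up = sum(1 for row in mask for px in row if px)
    return 1 if built_up / total > min_fraction else 0


# A 2x2 tile with a single built-up pixel, and a tile with none.
tile_with_building = [[0, 1], [0, 0]]
empty_tile = [[0, 0], [0, 0]]

label_a = image_level_label(tile_with_building)
label_b = image_level_label(empty_tile)
label_c = image_level_label(tile_with_building, min_fraction=0.5)
```

For target-domain images, where no mask is available, the same kind of tag would have to come from a human annotator, which is far cheaper than labeling every pixel.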

A realistic and challenging HR dataset is created by hand-tagging 73.4 sq-km of Rwanda, capturing a variety of built-up structures over different terrain. The developed dataset is spatially rich compared to existing datasets and covers diverse built-up scenarios, including built-up areas in forests and deserts, mud houses, and tin and colored rooftops. Extensive experiments are performed by adapting from single-source domain datasets, such as the Massachusetts Buildings Dataset, to segment out the target domain. We achieve high gains ranging from 11.6% to 52% in IoU over the existing state-of-the-art methods.
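The gains above are reported in IoU (Intersection over Union). As a reminder of what that metric measures for a binary built-up mask, here is a minimal sketch on plain nested lists. It is not the evaluation code used for the paper; the function name `iou` and the handling of two empty masks are assumptions made for illustration.

```python
def iou(pred, truth):
    """Intersection over Union for two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel is built-up in both masks
            if p or t:
                union += 1  # pixel is built-up in at least one mask
    # Assumed convention: two all-background masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

score = iou(pred, truth)  # 2 shared pixels out of 4 in the union
```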

The figure shows the proposed LT-WAN architecture for built-up region segmentation. A fully convolutional discriminator network is trained using X_G. The segmentation network is optimized with the L_adv loss from the discriminator network in the latent space, the L_seg loss in the output space and the L_H_d loss in both the latent and output spaces.
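A common way to combine loss terms like the three named in this caption is a weighted sum. The form below is an assumption for illustration only: the weights \lambda_{adv} and \lambda_{H} are hypothetical symbols that do not appear on this page, and the objective actually used in the paper may be structured differently.

```latex
\mathcal{L}_{total} = \mathcal{L}_{seg} + \lambda_{adv}\,\mathcal{L}_{adv} + \lambda_{H}\,\mathcal{L}_{H_d}
```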

Rwanda Built-up Region Segmentation Dataset:

We create the Rwanda built-up regions dataset, which differs from previously available datasets and is versatile in nature. The varying structure sizes and formations, irregular patterns of construction, buildings in forests and deserts, and the existence of mud houses make it very challenging. A total of 787 satellite images of size 256 × 256 are collected at a high resolution (HR) of 1.193 meters per pixel and hand-tagged for built-up region segmentation using the online tool Label-Box. The difference between Rwanda images and images from other existing datasets, shown in Fig. 1, makes it very challenging for an adaptation algorithm. The variations in visual appearance and terrain are successfully captured by our proposed adaptation method, as shown in the table and images below.
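The image count, tile size and resolution quoted above can be checked against the 73.4 sq-km coverage stated in the abstract. The short calculation below is a sanity check, assuming the tiles do not overlap; the function name `coverage_sq_km` is hypothetical.

```python
NUM_IMAGES = 787        # images in the dataset
TILE_SIDE_PX = 256      # each image is 256 x 256 pixels
GSD_M_PER_PX = 1.193    # ground resolution in meters per pixel


def coverage_sq_km(num_images, side_px, gsd_m):
    """Total ground area in sq-km, assuming non-overlapping square tiles."""
    tile_side_m = side_px * gsd_m      # about 305.4 m per tile side
    tile_area_m2 = tile_side_m ** 2    # about 93,274 square meters per tile
    return num_images * tile_area_m2 / 1e6


total_sq_km = coverage_sq_km(NUM_IMAGES, TILE_SIDE_PX, GSD_M_PER_PX)
print(f"{total_sq_km:.1f} sq-km")  # prints "73.4 sq-km"
```

The result agrees with the stated coverage, which suggests the tiles cover distinct ground areas.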

Qualitative Results:

Output of built-up area segmentation when adapted from Massachusetts to the Village-Finder, Potsdam and Rwanda datasets, respectively. Columns from left to right represent Target Image, Ground Truth, Source Only, OSA: output space adaptation (Tsai et al., 2018), LSA: latent space adaptation (Chen et al., 2017; Benjdira et al., 2019), OS-WAN and LT-WAN, respectively. The proposed LT-WAN and OS-WAN mitigate the false segmentation produced by OSA and LSA and improve the true positive rate significantly.

Quantitative Results:


PDF: Weakly-supervised Domain Adaptation for Built-up Region Segmentation

Github: Weakly-Supervised Adaptation Network

Dataset: Rwanda Built-up Region Segmentation Dataset


Semantic Segmentation, Domain Adaptation, Weakly-supervised Domain Adaptation, Satellite Imagery, Remote Sensing




title={Weakly-supervised domain adaptation for built-up region segmentation in aerial and satellite imagery},

author={Iqbal, Javed and Ali, Mohsen},

journal={ISPRS Journal of Photogrammetry and Remote Sensing},




publisher={Elsevier} }


Contact: javed.iqbal@itu.edu.pk