Mapping Temporary Slums From Satellite Imagery Using a Semi-Supervised Approach

(Published at IEEE Geoscience and Remote Sensing Letters 2022)

M. Fasi Ur Rehman,   Izza Aftab,   Waqas Sultani,   Mohsen Ali
Information Technology University, Pakistan

Fig. 1. (a) We use the U-Net architecture as the segmentation model. Two 1-D fully connected layers of 256 and 64 units are appended to the centermost layer of U-Net. The arrows from the centermost layer of U-Net to the dense layers indicate the flow of feature maps. (b) Converting pixel-level labels to image-level labels: y = 1 means the image is labeled as slum, whereas y = 0 means it is labeled as non-slum; µy is the mean value of the segmentation mask, i.e., the ratio of the area covered by pixels labeled as slum. (c) 64-dimensional embeddings for labeled and unlabeled data. (d) The scoring module is based on cosine similarity between labeled and unlabeled embeddings. (e) The scoring module returns two sorted lists, one for each class, i.e., slum and non-slum. The top K images are selected from each list.
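Steps (b), (d) and (e) of the figure can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `threshold` on µy, and the choice to score each unlabeled image by its mean cosine similarity to the labeled embeddings of a class are all assumptions made for illustration, since the caption does not specify them.

```python
import numpy as np


def image_level_label(mask, threshold=0.0):
    """Convert a binary segmentation mask (1 = slum pixel) to an image-level label.

    mu_y is the mean of the mask, i.e. the fraction of slum pixels.
    The threshold is an assumption; the caption does not state one.
    """
    mu_y = float(np.mean(mask))
    y = 1 if mu_y > threshold else 0
    return y, mu_y


def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_n @ b_n.T


def top_k_per_class(labeled_emb, labeled_y, unlabeled_emb, k):
    """Return, for each class, the indices of the k highest-scoring unlabeled images.

    Assumption: an unlabeled image's score for a class is its mean cosine
    similarity to all labeled embeddings of that class.
    """
    labeled_y = np.asarray(labeled_y)
    sims = cosine_similarity_matrix(unlabeled_emb, labeled_emb)  # shape (U, L)
    selected = {}
    for cls in (0, 1):  # 0 = non-slum, 1 = slum
        scores = sims[:, labeled_y == cls].mean(axis=1)
        order = np.argsort(-scores)  # sort in descending order of score
        selected[cls] = order[:k]
    return selected
```

In the figure the embeddings are 64-dimensional; the functions above accept any dimensionality, and each class needs at least one labeled embedding.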

Abstract

One billion people worldwide are estimated to be living in slums, and documenting and analyzing these regions is a challenging task. Compared with regular slums, temporary slums are small, scattered, and short-lived, which makes data collection and labeling tedious and time-consuming. To tackle the challenging problem of temporary slum detection, we present a semi-supervised, segmentation-based deep learning approach, together with a strategy for detecting initial seed images in a zero-labeled data setting. A small set of seed samples (32 in our case) is automatically discovered by analyzing temporal changes; these samples are then manually labeled and used to train a segmentation module and a representation learning module. The segmentation module gathers high-dimensional image representations, and the representation learning module transforms these representations into embedding vectors. A scoring module then uses the embedding vectors to sample images from a large pool of unlabeled images and generates pseudo-labels for the sampled images. The sampled images and their pseudo-labels are added to the training set, and the segmentation and representation learning modules are updated iteratively. To analyze the effectiveness of our technique, we construct a large, geographically marked dataset of temporary slums. This dataset comprises more than 200 potential temporary slum locations (2.28 km²) found by sieving 68,000 images from 12 metropolitan cities of Pakistan covering 8,000 km². Furthermore, our proposed method outperforms several competitive semi-supervised semantic segmentation baselines in a similar setting. The code and the dataset will be made publicly available.
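The iterative procedure described in the abstract follows the general shape of a self-training loop. The sketch below shows only that control flow under stated assumptions. The names `iterative_pseudo_labeling`, `train`, and `select_and_label` are hypothetical, and the toy stand-ins (a 1-D threshold "model" over scalar "images") merely make the loop runnable; they are not the paper's U-Net, embeddings, or scoring module.

```python
def iterative_pseudo_labeling(seed_set, unlabeled_pool, train, select_and_label, rounds):
    """Generic self-training loop (a sketch, not the authors' code).

    seed_set:         list of (image, label) pairs labeled by hand
    unlabeled_pool:   list of unlabeled images
    train:            callable(train_set) -> model            (hypothetical)
    select_and_label: callable(model, train_set, pool)
                      -> list of (pool_index, pseudo_label)   (hypothetical)
    """
    train_set = list(seed_set)
    pool = list(unlabeled_pool)
    model = train(train_set)
    for _ in range(rounds):
        if not pool:
            break
        picked = select_and_label(model, train_set, pool)
        if not picked:
            break
        picked_idx = {i for i, _ in picked}
        # Move the sampled images, with their pseudo-labels, into the training set.
        train_set.extend((pool[i], label) for i, label in picked)
        pool = [x for j, x in enumerate(pool) if j not in picked_idx]
        # Update the model on the enlarged training set.
        model = train(train_set)
    return model, train_set, pool


# Toy stand-ins: an "image" is a single float and the "model" is a threshold.
def toy_train(train_set):
    zeros = [x for x, y in train_set if y == 0]
    ones = [x for x, y in train_set if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def toy_select(threshold, train_set, pool):
    # Pick the most extreme pool item on each side of the threshold (K = 1 per class).
    below = [(x, i) for i, x in enumerate(pool) if x < threshold]
    above = [(x, i) for i, x in enumerate(pool) if x > threshold]
    picked = []
    if below:
        picked.append((min(below)[1], 0))
    if above:
        picked.append((max(above)[1], 1))
    return picked


model, train_set, pool = iterative_pseudo_labeling(
    seed_set=[(0.0, 0), (1.0, 1)],
    unlabeled_pool=[0.1, 0.9, 0.5, 0.2, 0.8],
    train=toy_train,
    select_and_label=toy_select,
    rounds=2,
)
```

Passing the training and selection steps as callables keeps the loop independent of any particular network or scoring rule, which is why the real modules can be swapped in without changing the control flow.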

RESULTS

Comparison of the segmentation results of all the methods listed in Table I. The segmentation masks generated by the respective methods are overlaid on the satellite imagery.

PDF, Code and Results


Neurocomputing 2022

Authors' Information

M. Fasi Ur Rehman:
Email:

Izza Aftab:
Email:

Waqas Sultani:
Email:

Dr. Mohsen Ali, Assistant Professor, Intelligent Machines Lab, ITU, Lahore, Pakistan
Email: mohsen.ali@itu.edu.pk
Web: https://im.itu.edu.pk/