A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability

(Published at MICCAI 2024)

Abdul Rehman, Talha Meraj, Aiman Mahmood Minhas, Ayisha Imran, Mohsen Ali, Waqas Sultani
Intelligent Machines Lab, Information Technology University, Pakistan

Abstract

Early diagnosis of leukemia can save thousands of lives annually. The prognosis of leukemia is challenging without the morphological information of White Blood Cells (WBCs), and it relies on access to expensive microscopes and on the availability of hematologists to analyze Peripheral Blood Samples (PBS). Deep learning based methods can assist hematologists; however, these algorithms require a large amount of labeled data, which is not readily available. To overcome this limitation, we have acquired a realistic, generalized, and large dataset. To make the dataset comprehensive enough for real-world applications, two microscopes from two different cost spectrums (a high-cost microscope, HCM, and a low-cost microscope, LCM) are used to capture images at three magnifications (100x, 40x, 10x) through different sensors (a high-end camera for HCM, a middle-level camera for LCM, and a mobile-phone camera for both). The high-end camera is 47 times more expensive than the middle-level camera, and HCM is 17 times more expensive than LCM. In this collection, using HCM at high resolution (100x), experienced hematologists annotated 10.3k WBCs (14 types) and artifacts, with 55k morphological labels (Cell Size, Nuclear Chromatin, Nuclear Shape, etc.), from 2.4k images of PBS from several leukemia patients. These annotations are then transferred to the other two magnifications of HCM, to the three magnifications of LCM, and to the images captured by each camera. Along with the LeukemiaAttri dataset, we provide baselines over multiple object detectors and Unsupervised Domain Adaptation (UDA) strategies, along with morphological information-based attribute prediction. The dataset will be publicly available after publication to facilitate research in this direction.

Contribution

We introduce a large-scale multi-domain WBC detection benchmark, together with morphological attributes of WBCs that hematologists recommend for the prognosis of leukemia. To facilitate future research, we construct extensive WBC detection and UDA baselines. We also introduce a multi-headed architecture for WBC detection and morphological attribute prediction.

Method

To achieve explainable WBC detection, we propose a multi-headed WBC detector. We first apply recent object detectors from different families, including one-stage (YOLOv5), two-stage (Sparse-Faster-RCNN), and transformer-based (DINO) detectors. We chose these methods because of their good detection results, efficiency, and low memory consumption. Because YOLOv5 gave the better overall results on our dataset, we extended it for attribute prediction by adding a lightweight attribute head that predicts WBC morphology. The attribute head is a small Convolutional Neural Network (CNN). Attribute prediction needs low-level visual details, so we fuse features from the two initial layers, which carry structural and semantically enriched information. Specifically, we extract features at different levels from the YOLOv5x backbone, apply RoI Align, and run extensive experiments on different combinations of fused features. The experimental results revealed that the features of the initial two layers contain structural and semantically enriched information that is effective for attribute prediction.
To train the attribute head together with the YOLOv5 heads, an asymmetric loss is employed. Together, the YOLOv5 object detections and the attribute head predictions give each cell's type and its morphology, which provides explainable reasoning.
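
The text names the loss but not its exact form. The following is a minimal sketch, assuming the widely used multi-label asymmetric loss formulation (separate focusing exponents for positive and negative labels, plus a probability margin applied to negatives). The function name `asymmetric_loss` and the hyperparameter values are illustrative assumptions, not the authors' settings.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Mean asymmetric loss over binary attribute labels.

    probs   -- predicted probabilities in [0, 1], one per attribute
    targets -- ground-truth labels (1 = attribute present, 0 = absent)
    All hyperparameter defaults are assumed values for illustration.
    """
    if len(probs) == 0 or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with exponent gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so
            # that easy negatives (p <= margin) contribute nothing.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total / len(probs)


if __name__ == "__main__":
    # Three attributes for one cell: two present, one absent.
    print(asymmetric_loss([0.9, 0.6, 0.2], [1, 1, 0]))
```

With a larger exponent on the negative term, confidently rejected attributes are down-weighted more strongly than positives, which is the usual motivation for this loss when most attribute labels are negative.
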
These predictions are registered in a WBC morphology bank, and blood film-level descriptions are generated based on the most frequently appearing WBC type (as recommended by hematologists). Because the blood film-level data in the collected dataset were captured using a microscope, blood film-level explainability can be produced, as shown in Figure 1, which is a helpful second opinion for hematologists.
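
The aggregation step described above can be sketched as follows. This is a hedged illustration only: the record layout, the function name `summarize_blood_film`, the attribute values, and the majority vote over attributes are assumptions; the text only states that the description is based on the most frequently appearing WBC type.

```python
from collections import Counter, defaultdict


def summarize_blood_film(detections):
    """Summarize per-cell predictions into a blood film-level record.

    detections -- list of dicts with keys 'cell_type' (str) and
                  'attributes' (dict mapping attribute name -> value).
    Returns None when there are no detections.
    """
    type_counts = Counter(d["cell_type"] for d in detections)
    if not type_counts:
        return None
    # The most frequently appearing WBC type drives the description.
    dominant_type, count = type_counts.most_common(1)[0]

    # Assumed step: majority vote per attribute among the dominant cells.
    attr_votes = defaultdict(Counter)
    for d in detections:
        if d["cell_type"] == dominant_type:
            for name, value in d["attributes"].items():
                attr_votes[name][value] += 1
    typical = {name: votes.most_common(1)[0][0]
               for name, votes in attr_votes.items()}

    return {
        "dominant_type": dominant_type,
        "count": count,
        "total": len(detections),
        "typical_attributes": typical,
    }


if __name__ == "__main__":
    bank = [
        {"cell_type": "Lymphoblast",
         "attributes": {"Cell Size": "large", "Nuclear Shape": "round"}},
        {"cell_type": "Lymphoblast",
         "attributes": {"Cell Size": "large", "Nuclear Shape": "round"}},
        {"cell_type": "Monocyte",
         "attributes": {"Cell Size": "small", "Nuclear Shape": "irregular"}},
    ]
    print(summarize_blood_film(bank))
```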

We discussed the explainability produced by AttriDet with hematologists, who appreciated this explanation and see our method as a potential candidate for a second opinion in leukemia prognosis. Furthermore, AttriDet not only predicts the attributes associated with a cell but also improves YOLOv5's detection performance (mAP improved from 26.3 to 28.2).

LeukemiaAttri Dataset

To gain a comprehensive understanding of leukemia prognosis and its impact, we consulted several healthcare professionals from different working environments and finalized the WBC types and their morphological attributes. For this dataset collection, peripheral blood films (PBFs) were obtained from the diagnostic lab with patient consent, incorporating the hematologist's marks on the region within the monolayer area. To capture images, we use two microscopes, the high-cost Olympus CX23 (HCM) and the low-cost XSZ-107BN (LCM), in conjunction with two dedicated cameras, the HD1500T (for HCM) and the ZZCAT 5MP (for LCM), plus the Honor 9x Lite mobile camera (for both HCM and LCM). It is quite challenging to locate the same patch on the PBF when employing different microscopes and magnifications. To address this challenge, we started the capturing process by setting the field of view (FoV) at 10x and 40x with an approximate 20% overlap, maintaining a fixed x-axis stage scale. At 100x magnification, we captured the FoVs containing WBCs without any overlap, ensuring a distinct representation of individual WBCs. This process was repeated for both HCM and LCM.
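
To make the 20% overlap concrete, here is a small arithmetic sketch of where consecutive fields of view would start along one stage axis while the other axis is held fixed. The function name `stage_positions`, the units, and the numeric values are hypothetical; the text gives only the approximate overlap, not the stage coordinates or FoV sizes.

```python
import math


def stage_positions(scan_length, fov_width, overlap=0.20):
    """Start positions of consecutive FoVs along one stage axis.

    scan_length -- length of the region to cover (same unit as fov_width)
    fov_width   -- width of one field of view at the chosen magnification
    overlap     -- fraction of each FoV shared with the next one
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if fov_width <= 0 or scan_length < fov_width:
        raise ValueError("scan_length must be at least one fov_width")
    # Each step advances by the non-overlapping part of the FoV.
    step = fov_width * (1.0 - overlap)
    # Number of FoVs that fit fully inside the scan length.
    n = int(math.floor((scan_length - fov_width) / step + 1e-9)) + 1
    return [i * step for i in range(n)]


if __name__ == "__main__":
    # Hypothetical numbers: a 10-unit strip imaged with a 2-unit-wide FoV.
    for x in stage_positions(10.0, 2.0, overlap=0.20):
        print(round(x, 3))
```

Setting `overlap=0.0` reproduces the non-overlapping capture described for 100x, although at that magnification the text says only fields containing WBCs were captured.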

 

Morphological attributes

The set of rules for WBC morphology varies among hematologists. To enhance prognostic assistance, hematologists identified 14 types of WBC and considered seven key morphological attributes for a well-informed prognosis. To annotate the WBC type and morphological attributes, hematologists reviewed subsets of the captured images and then selected the best-structured and best-quality images from the given subset (HD1500T camera paired with HCM at 100x) of the LeukemiaAttri dataset. For quality control, two hematologists annotated each cell in consultation with each other. Details of some WBC types with their morphological information are shown in Figure 2, where A) monoblasts, B) monocytes, and C) myelocytes exhibit mostly similar morphological characteristics, as they originate from the myeloid lineage.

Nevertheless, differences arise, particularly in the presence of cytoplasmic vacuolation. In contrast, D) lymphoblasts belong to the lymphoid lineage and show morphological dissimilarities from the myeloid-lineage cells. After obtaining detailed WBC and attribute annotations from hematologists for HCM at 100x, we transferred the annotations automatically to the other resolutions and across microscopes using homography. Transferred annotations were verified manually, and re-annotation was done for missing localizations. The detailed counts of the source subset of WBC types and their corresponding attributes are shown in the accompanying figure.

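
The homography-based transfer can be illustrated with a short sketch. It assumes the 3x3 homography matrix `H` between the source image (HCM at 100x) and a target image is already known; how it is estimated is not described here. The helper names and the example matrix are hypothetical.

```python
def apply_homography(H, x, y):
    """Map one point (x, y) through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def transfer_box(H, box):
    """Transfer an axis-aligned box (x1, y1, x2, y2) to the target image.

    All four corners are mapped, then the tightest axis-aligned box
    around them is returned, since a homography need not keep the box
    axis-aligned.
    """
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    mapped = [apply_homography(H, x, y) for x, y in corners]
    xs = [p[0] for p in mapped]
    ys = [p[1] for p in mapped]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Hypothetical homography: uniform scaling by 0.4 plus a shift.
    H = [[0.4, 0.0, 10.0],
         [0.0, 0.4, 20.0],
         [0.0, 0.0, 1.0]]
    print(transfer_box(H, (100.0, 200.0, 150.0, 260.0)))
```

Boxes that land partly or fully outside the target image would still need the manual verification and re-annotation step mentioned above.
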
About Authors

Abdul Rehman

PhD Fellow, Intelligent Machines Lab, ITU, Lahore, Pakistan

Email: phdcs23002@itu.edu.pk
LinkedIn: Abdul Rehman

Talha Meraj

Research Associate, Intelligent Machines Lab, ITU, Lahore, Pakistan

Email: talha.meraj@itu.edu.pk
LinkedIn: Talha Meraj

Aiman Mahmood Minhas

Hematologist, Department of hematology, Chughtai lab, Lahore, Pakistan
 

Ayisha Imran

Consultant, Department of hematology, Chughtai lab, Lahore, Pakistan
 

Mohsen Ali

Associate Professor (Tenured), Department of AI, ITU, Pakistan
 
LinkedIn: Mohsen Ali

Waqas Sultani

Assistant Professor, Department of AI, ITU, Lahore, Pakistan
 
LinkedIn: Waqas Sultani

BibTeX

@inproceedings{rehman2024large,
  title={A large-scale multi domain leukemia dataset for the white blood cells detection with morphological attributes for explainability},
  author={Rehman, Abdul and Meraj, Talha and Minhas, Aiman Mahmood and Imran, Ayisha and Ali, Mohsen and Sultani, Waqas},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={553--563},
  year={2024},
  organization={Springer}
}