Computer Vision and Machine Learning Research Group


The Computer Vision and Machine Learning (CVML) research group addresses a wide range of problems, including Affective Computing, Remote Sensing, Object Detection, Semantic Segmentation, and Anomaly Detection. The group explores and implements novel techniques using computer vision and machine/deep learning algorithms. Our research covers both the theoretical and practical aspects of deep learning, and we are eager to solve locally relevant problems using learning methods. CVML aims to design and apply intelligent algorithms that perform computer vision tasks efficiently.

Relevant faculty can be found here.

Ongoing Projects

Deep Learning

We develop and deploy deep learning algorithms. Deep learning is an area of machine learning that moves the field closer to one of its original goals: artificial intelligence.

Project Name Description
Anomaly Detection

In addition to proposing a new anomaly detection method, we introduce a new large-scale, first-of-its-kind dataset of 128 hours of video. It consists of 1900 long, untrimmed real-world surveillance videos covering 13 realistic anomalies, such as Abuse, Arrest, Arson, Assault, Accident, Burglary, Explosion, Fighting, Robbery, and Shooting, as well as normal activities.
Please check out our CVPR 2018 paper:

PI: Dr. Waqas Sultani



Object Detection

The display of firearms in public places is a strong indicator of potential criminal activity. Deploying firearm detectors in current video surveillance systems will therefore not only aid law-enforcement agencies in ensuring human safety but also relieve human operators of the burden of watching every second of the live feed to detect suspicious events.

Work in progress

PI: Dr. Mohsen Ali


Moving Object Segmentation in Videos

Video object segmentation aims to cluster the pixels of a video into objects or background. In this project, instead of treating deep learning as a black box and iterating endlessly on the network design, we focus on providing more intelligent and informative input. We exploit the geometric constraints between frames to recover what is similar across them. These constraints have been studied extensively and, unlike data-dependent learning, they have closed-form solutions (for given correspondences in the images).
In collaboration with: Dr. Ijaz, Postdoctoral Researcher at NUS
Work in progress

PI: Dr. Mohsen Ali
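
As an illustration of a geometric constraint with a closed-form solution, the sketch below estimates a planar homography between two frames from point correspondences using the Direct Linear Transform. The project description does not say which constraints are used, so the choice of a homography is an assumption, and the function names (estimate_homography, apply_homography, reprojection_residual) are hypothetical. Points with a large residual under the estimated background motion would be candidates for moving objects.

```python
import numpy as np


def estimate_homography(src, dst):
    """Closed-form (DLT) homography H with dst ~ H @ src, from >= 4 correspondences.

    Assumes the true H has a non-zero bottom-right entry so it can be normalized.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def apply_homography(H, pts):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def reprojection_residual(H, src, dst):
    """Per-point distance between the mapped source points and the observed points."""
    return np.linalg.norm(apply_homography(H, src) - np.asarray(dst, dtype=float), axis=1)


# Hypothetical correspondences: background points that follow a known homography.
H_true = np.array([[1.1, 0.02, 5.0],
                   [-0.03, 0.95, -3.0],
                   [1e-4, 2e-4, 1.0]])
src_pts = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 20], [30, 60]], dtype=float)
dst_pts = apply_homography(H_true, src_pts)

H_est = estimate_homography(src_pts, dst_pts)
residuals = reprojection_residual(H_est, src_pts, dst_pts)
```

With noise-free correspondences the residuals are close to zero. In practice, correspondences are noisy, so a robust estimator would normally wrap this closed-form step.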


Domain Adaptation of Semantic Segmentation

Semantic segmentation is a challenging problem because it requires pixel-level annotations. Deep Convolutional Neural Networks (DCNNs) achieve strong results on semantic segmentation, but training data for real-time applications remains limited. Domain adaptation for semantic segmentation adapts a model to the data distribution of the target domain without access to its labels, so that segmentation works effectively in real-time scenarios. Generative Adversarial Networks are also used to learn the distributions of the source and target data simultaneously and to minimize the difference between them.

PI: Dr. Mohsen Ali
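
The adversarial idea described above can be sketched with two loss terms. This is a minimal NumPy illustration, not the group's implementation. It assumes a discriminator that outputs the probability that a segmentation output came from the source domain, and the function names (binary_cross_entropy, discriminator_loss, adversarial_loss) are hypothetical.

```python
import numpy as np


def binary_cross_entropy(prob, label):
    """Mean binary cross-entropy between predicted probabilities and a constant label."""
    prob = np.clip(np.asarray(prob, dtype=float), 1e-7, 1 - 1e-7)
    return float(np.mean(-(label * np.log(prob) + (1 - label) * np.log(1 - prob))))


def discriminator_loss(d_source, d_target):
    """The discriminator learns to output 1 for source inputs and 0 for target inputs."""
    return binary_cross_entropy(d_source, 1.0) + binary_cross_entropy(d_target, 0.0)


def adversarial_loss(d_target):
    """The segmentation network is rewarded when target outputs look like source ones."""
    return binary_cross_entropy(d_target, 1.0)


# Hypothetical discriminator outputs (probability of "source") for a small batch.
confident = discriminator_loss(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
confused = discriminator_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

Minimizing adversarial_loss alongside the supervised segmentation loss on labeled source data pushes the two output distributions together. How the two terms are weighted is a design choice the description above does not specify.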


Remote Sensing

We at ITU are studying satellite and aerial imagery to develop tools that will assist governmental and non-governmental organizations in analyzing urban population, road structure, urban and rural structural development, agricultural regions, animal migrations, and the destruction caused by natural disasters. So far, we have primarily studied satellite and aerial images to detect residential areas using computer vision and machine learning techniques.

PI: Dr. Mohsen Ali


Affective Computing

Affective computing is a research field that analyzes the affective content of multimedia in order to build emotionally intelligent machines capable of better human-machine interaction. Exciting applications of affective computing include affective content recommendation, abstraction, and affective description generation. Affective content analysis also helps us understand why specific content evokes a particular emotion in its viewers.

PI: Dr. Mohsen Ali


Medical Image Analysis

Vision impairment is one of the world's major health problems: according to the World Health Organization (WHO), more than 285 million people suffer from vision impairment, of whom more than 39 million are blind. In this project, we predict blindness from retinal layer images, which fall into four classes: Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), DRUSEN (associated with weakened vision), and Normal (observation only). A prediction of blindness leads to anti-Vascular Endothelial Growth Factor (anti-VEGF) treatment, which stops the retinal disease and helps the retina function properly.

$$I = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}, \qquad x_i \in \mathbb{R}^d$$
$$y_i \in \{0\ (\text{CNV}),\ 1\ (\text{DME}),\ 2\ (\text{DRUSEN}),\ 3\ (\text{NORMAL})\}$$

PI: Dr. Mohsen Ali
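
The dataset notation in the Medical Image Analysis entry can be mirrored in code as follows. This is a minimal sketch assuming NumPy and feature vectors that have already been extracted from the retinal images. CLASS_NAMES, make_dataset, and decode_prediction are hypothetical names, not the project's code.

```python
import numpy as np

# Integer labels as listed in the project description.
CLASS_NAMES = {0: "CNV", 1: "DME", 2: "DRUSEN", 3: "NORMAL"}


def make_dataset(features, labels):
    """Pair each d-dimensional feature vector x_i with its integer label y_i."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=int)
    if X.ndim != 2:
        raise ValueError("features must have shape (n, d)")
    if len(X) != len(y):
        raise ValueError("features and labels must have the same length")
    if not set(y.tolist()) <= set(CLASS_NAMES):
        raise ValueError("labels must be one of 0, 1, 2, 3")
    return list(zip(X, y))


def decode_prediction(scores):
    """Return the class name with the highest score among the four classes."""
    scores = np.asarray(scores, dtype=float)
    if scores.shape != (len(CLASS_NAMES),):
        raise ValueError("expected one score per class")
    return CLASS_NAMES[int(np.argmax(scores))]


# Hypothetical example: two samples with d = 2 features each.
dataset = make_dataset([[0.1, 0.2], [0.3, 0.4]], [0, 3])
predicted = decode_prediction([0.1, 0.7, 0.1, 0.1])
```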

Relevant Projects

Relevant projects can be found here