Deep Learning – Spring 2020
Class Hours: Tue 5:30 pm to 7:00 pm, Thur 7:15 pm to 8:45 pm | Office Hours and Contact Info. Instructor: Mohsen Ali |
Course Basics: Core Course | Prerequisite: Enthusiasm, Energy and Imagination |
Course Overview
We are going to take a “get your hands dirty” approach: you will be given assignments and projects to implement the ideas discussed in class. Projects and assignments will contain miniature versions of existing real-life applications and problems (e.g., can you train your computer to generate dialogue in Shakespeare's style, convert your image into a painting in the style of Monet, or perform sentiment analysis?).
The course will concentrate on developing both mathematical knowledge and implementation capabilities. We will start by training a single perceptron, move on to training a deep neural network, and study why training large networks is a problem and what the possible solutions are. After dipping our toes into deep belief networks and recurrent neural networks, we will look into applications of deep learning in three different areas: text analysis, speech processing, and computer vision. The objective of this approach is to make you comfortable enough that you can understand various research problems and, if interested, implement deep learning based applications.
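The starting point named above, training a single perceptron, can be sketched in a few lines. This is only an illustrative sketch and not course material: the function names `train_perceptron` and `predict`, the toy logical-AND dataset, the step activation, and the learning rate are all assumptions chosen to keep the example small, and plain Python is used so that it runs without extra packages.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule: adjust the weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0  # step activation
            delta = y - prediction  # 0 when correct, +1 or -1 otherwise
            if delta != 0:
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly, stop early
            break
    return weights, bias


def predict(weights, bias, x):
    """Classify one sample with the learned weights and bias."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy data (an assumption for illustration): the logical AND function.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
print(weights, bias)
print([predict(weights, bias, x) for x in samples])
```

Because AND is linearly separable, the perceptron rule is guaranteed to stop making mistakes after a finite number of updates; on data that is not linearly separable the loop would simply run for all `epochs`.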
Course Objectives
In the last few years, machine learning has matured from science fiction to reality. We are living in a world where industry has already brought to reality self-driving cars, face recognizers that work at massive scale (Facebook), and speech translation systems that can translate from one language to many others simultaneously and in real time. More interestingly, we have machines that can learn to play Atari games in much the same way we do.
A lot of these victories have come from the exciting field of Deep Learning, a learning methodology based on the concept that the human mind captures details at multiple levels of abstraction. One property of deep learning is that it removes the responsibility of designing features from humans; instead, the learning algorithm is given the task of finding an appropriate representation.
Grading Policy
- 45% Assignments
- 5% Class participation and Creating Notes
- 20% Final Project
- 10% Quizzes
- 10% Midterm Exam
- 10% Final Exam
Honor Code
All cases of academic misconduct will be forwarded to the disciplinary committee. All assignments are group based unless explicitly specified by the instructor. In the words of Efros, let’s not embarrass ourselves.
Tentative and Rough Course Outline
Weeks | Topics | Evaluations |
1 | Introduction to Deep Learning: Difference between Machine Learning and Deep Learning. Basic Machine Learning: Linear & Logistic Regression | |
2 | Supervised Learning with Neural Networks: Deep Learning, Single and Multi-Layer Neural Networks, Perceptron Rule, Gradient Descent, Backpropagation, Loss Functions. Tutorial 1: Python/Numpy Tutorial/Anaconda | Assignment 1 |
3 | Hyperparameter Tuning, Regularization and Optimization: Parameters vs Hyperparameters, Why regularization reduces overfitting, Data Augmentation, Vanishing/Exploding gradients, Weight Initialization Methods, Optimizers. Tutorial 2: Building a Linear Classifier | Assignment 2 |
4 | Convolutional Neural Networks: Convolutional Filters, Pooling Layers, Classic CNNs: AlexNet, VGG, GoogLeNet, ResNet, DenseNet. Transfer Learning. Tutorial 4: CNN Visualization | Assignment 3 & Assignment 4 |
5 | Deep Learning for Vision Problems: Object Localization & Detection, Bounding box predictions, Anchor boxes, Region Proposal Networks, Detection Algorithms: RCNN, Faster RCNN, YOLO, SSD. Tutorial 5: Caffe & Object Detection | Assignment 5 |
6 | Sequence Models: Recurrent Neural Networks (RNN), Gated Recurrent Unit (GRU), Long Short Term Memory (LSTM), Bidirectional RNN, Backpropagation through time. Image Caption Generation, Machine Translation, Text Generation & Summarization. Tutorial 6: Image Captioning & Text Generation | |
7 | Auto-Encoders & Generative Models: Variational Auto-Encoders, Stacked Auto-Encoders, Denoising Auto-Encoders, Concept of Generative Adversarial Networks (GANs) | |
8 | Miscellaneous: Capsule Networks, Convolutional LSTM, Attention Networks, Restricted Boltzmann Machine, One Shot Learning, Siamese Networks, Triplet Loss, Graph CNN, Approximate and Energy Efficient Design for Deep CNN (Dr. Rehan Hafiz) | |
Course Notes
Date | Topics | Notes / Reading Material / Comments | News |
4th Feb 2020 | Introduction | 45% Assignments; 20% Final Project; 5% Class participation and Creating Notes; 10% Quizzes; 10% Midterm Exam; 10% Final Exam. Recommended Resources | |
6th Feb 2020 | Supervised Learning | Assigned Readings; Recommended Readings; Refresh | |
11th and 13th Feb 2020 | Logistic Regression | Assigned Readings; Recommended Reading on Logistic Regression; Recommended Reading: Softmax and Cross Entropy; Assigned Readings (Linear Algebra) | |
18th Feb 2020 | Gradient Descent | Recommended Readings | Posted Assignment 1: Linear Regression with and without Gradient Descent |
20th Feb 2020 | Optimization | Recommended Readings | |
25th Feb 2020 | Neural Network | Recommended Readings: Perceptron Rule; Assigned Readings; Optional | Posted Assignment 2: Implementation of Neural Network |
27th Feb 2020 | Textures and Filters | | |
10th March 2020 | CNN continued | Notes; Assigned Readings; Recommended Readings | |
12th March 2020 | CNN continued | Assigned Reading; Recommended Readings | Posted Assignment 3: Implementation of Convolutional Neural Networks (Forward Propagation only) |
19th March 2020 | CNN Backpropagation | Reference; Video Lecture | Posted Assignment 4: Implementation of Convolutional Neural Networks with Backpropagation |
Start of online lectures after a 3-week break due to the COVID-19 outbreak | | | |
7th April 2020 | NN training | | |
9th and 14th April 2020 | Batch Normalization and Regularization | Assigned Readings; Video Lectures | |
16th April 2020 | Receptive Field | Assigned Readings | |
21st April 2020 | Popular CNN architectures | | Posted Assignment 5 – Part 1: Detecting Coronavirus Infections through Chest X-Ray images |
23rd April 2020 | Popular CNN architectures (continued) | Assigned Reading; Recommended Readings | |
24th April 2020 (Tutorial) | PyTorch Tutorial Session | | |
30th April 2020 | Semantic Segmentation | Interesting Reading | Posted Assignment 5 – Part 2: Focal Loss for Handling Class Imbalance in Detecting Coronavirus Infections through Chest X-Ray images |
5th May 2020 | Object Detection | | |
12th May 2020 | Computer Vision Problems (Continued) | | |
14th May 2020 | Sequence Modelling | | |
19th May 2020 | RNNs (continued) | Assigned Readings | |
21st May 2020 | Autoencoders | Hugo Larochelle’s lectures are a treat: Slides; Reading | |
2nd June 2020 | Autoencoders (continued) | | |
4th June 2020 | GANs | | |
9th June 2020 | Paper Presentations | | |
11th June 2020 | Presentations (continued) | | |
15th June 2020 | Presentations (continued) | | |
16th June 2020 | Presentations (continued) | | Posted Assignment 6 (Optional): Preparing Slides for Advanced Topics in Deep Learning |
End of Lectures | | | |
Text Book
- Text Book: Deep Learning by Ian Goodfellow
Recommended Readings
There is no assigned textbook; however, the following are recommended for reading.
Assignments
- Assignment 1: Linear Regression with and without Gradient Descent
- Assignment 2: Implementation of Neural Network
- Assignment 3: Implementation of Convolutional Neural Networks
- Assignment 4: Implementation of CNNs with Backpropagation
- Assignment 5 – Part 1: Detecting Coronavirus Infections through Chest X-Ray Images
- Assignment 5 – Part 2: Focal Loss for Handling Class Imbalance in Detecting Coronavirus Infections through Chest X-Ray Images
- Assignment 6: Preparing Slides for Advanced Topics in Deep Learning
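To give a feel for what the title of Assignment 1 refers to, the sketch below fits a line to a handful of points in two ways: with the closed-form least-squares solution (no gradient descent) and with gradient descent on the mean squared error. It is not the assignment's specification or solution. The single-feature setup, the toy data, the function names `fit_closed_form` and `fit_gradient_descent`, the learning rate, and the step count are all assumptions made for illustration, and plain Python is used so the example has no dependencies.

```python
def fit_closed_form(xs, ys):
    """Least-squares line fit without gradient descent (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Same model, fitted by gradient descent on the mean squared error."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(residual^2) w.r.t. slope and intercept.
        grad_slope = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = (2.0 / n) * sum(residuals)
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept
    return slope, intercept


# Toy data (an assumption for illustration), roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

cf = fit_closed_form(xs, ys)
gd = fit_gradient_descent(xs, ys)
print("closed form:     ", cf)
print("gradient descent:", gd)
```

With a sufficiently small learning rate, both approaches should arrive at essentially the same slope and intercept; the closed form does so in one step, while gradient descent generalises to models that have no closed-form solution.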
Projects
Project Related Paper Presentations
Group ID | Group Members | Project Title | Paper Details |
G1A | | Handwritten Urdu Keyword Recognition using Capsule Networks | Kosiorek, Adam, Sara Sabour, Yee Whye Teh, and Geoffrey E. Hinton. “Stacked capsule autoencoders.” In Advances in Neural Information Processing Systems, pp. 15486-15496. 2019. |
G1B | | Malaria Detection and classification in microscopic images | Feng Yang, Nicolas Quizon, Hang Yu, Kamolrat Silamut, Richard J. Maude, Stefan Jaeger, Sameer Antani, “Cascading YOLO: automated malaria parasite detection for Plasmodium vivax in thin blood smears,” Proc. SPIE 11314, Medical Imaging 2020: Computer-Aided |
G2B | | Malaria Detection and classification in microscopic images | Vijayalakshmi, A. “Deep learning approach to detect malaria from microscopic images.” Multimedia Tools and Applications (2019): 1-21. |
G3B | | Malaria Detection and classification in microscopic images | Rajaraman, Sivaramakrishnan, Stefan Jaeger, and Sameer K. Antani. “Performance evaluation of deep neural ensembles toward malaria parasite detection in thin-blood smear images.” PeerJ 7 (2019): e6977. |
G1D | | Drug Discovery | Jiang, Mingjian, Zhen Li, Shugang Zhang, Shuang Wang, Xiaofeng Wang, Qing Yuan, and Zhiqiang Wei. “Drug–Target Affinity Prediction Using Graph Neural Network and Contact Maps,” June 1, 2020. https://doi.org/10.1039/D0RA02297G. |
G3E | | Object Detection in the dark/night | Wei, Chen, et al. “Deep retinex decomposition for low-light enhancement.” arXiv preprint arXiv:1808.04560 (2018). |
G2G | | Domain Adaptation for Emotion Detection from Face Expressions (Western to Pakistani Dramas & Talk Shows) | Wang, Xiaoqing, Xiangjun Wang, and Yubo Ni. “Unsupervised domain adaptation for facial expression recognition using generative adversarial networks.” Computational intelligence and neuroscience 2018 (2018). |
G3G | | Domain Adaptation for Emotion Detection from Face Expressions (Western to Pakistani Dramas & Talk Shows) | Kalischek, Nikolai, Patrick Thiam, Peter Bellmann, and Friedhelm Schwenker. “Deep Domain Adaptation for Facial Expression Analysis.” In 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp. 317-323. IEEE, 2019. |
G1H | | COVID-19 Tweets Analysis | M. Jabreel and Antonio Moreno. “A Deep Learning-Based Approach for Multi-Label Emotion Classification in Tweets.” In the Journal of Multidisciplinary Digital Publishing Institute (MDPI). March ’19, 1-16. |
G2H | | COVID-19 Tweets Analysis | Lai, Siwei, Liheng Xu, Kang Liu, and Jun Zhao. “Recurrent convolutional neural networks for text classification.” In Twenty-ninth AAAI conference on artificial intelligence. 2015. |
G3H | | COVID-19 Tweets Analysis | Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. “Language models are unsupervised multitask learners.” OpenAI Blog 1, no. 8 (2019): 9. |
G1J | | Urdu Optical Character Recognition for Twitter screenshots | Jain, Mohit, Minesh Mathew, and C. V. Jawahar. “Unconstrained ocr for urdu using deep cnn-rnn hybrid networks.” In 2017 4th IAPR Asian Conference on Pattern Recognition (ACPR), pp. 747-752. IEEE, 2017. |
G1K | | Urdu Caption Generation | Liu, Xihui, Hongsheng Li, Jing Shao, Dapeng Chen, and Xiaogang Wang. “Show, tell and discriminate: Image captioning by self-retrieval with partially labeled data.” In Proceedings of the European Conference on Computer Vision (ECCV), pp. 338-354. 2018. |
G2K | | Urdu Caption Generation | Li, Yikang, Wanli Ouyang, Bolei Zhou, Kun Wang, and Xiaogang Wang. “Scene graph generation from objects, phrases and region captions.” In Proceedings of the IEEE International Conference on Computer Vision, pp. 1261-1270. 2017. |
G3K | | Urdu Caption Generation | Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. “Show, attend and tell: Neural image caption generation with visual attention.” In International conference on machine learning, pp. 2048-2057. 2015. |
G1L | | Urdu Text Detection | Liu, X., Liang, D., Yan, S., Chen, D., Qiao, Y., & Yan, J. (2018). Fots: Fast oriented text spotting with a unified network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5676-5685). |
G1N | | Deep Fundamental Matrix Estimation or Depth Estimation | Godard, Clement, Oisin Mac Aodha, and Gabriel J. Brostow. “Unsupervised Monocular Depth Estimation with Left-Right Consistency.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. |
G3N | | Deep Fundamental Matrix Estimation or Depth Estimation | Ranftl, René, and Vladlen Koltun. “Deep fundamental matrix estimation.” In Proceedings of the European Conference on Computer Vision (ECCV), pp. 284-299. 2018. |
G1O | | Urban analysis through Satellite Imagery | Batra, Anil, Suriya Singh, Guan Pang, Saikat Basu, C. V. Jawahar, and Manohar Paluri. “Improved road connectivity by joint learning of orientation and segmentation.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 10385-10393. 2019. |
G2O | | Urban analysis through Satellite Imagery | Jean, Neal, Marshall Burke, Michael Xie, W. Matthew Davis, David B. Lobell, and Stefano Ermon. “Combining satellite imagery and machine learning to predict poverty.” Science 353, no. 6301 (2016): 790-794. |
G3O | | Urban analysis through Satellite Imagery | Pan, Zhuokun, Jiashu Xu, Yubin Guo, Yueming Hu, and Guangxing Wang. “Deep Learning Segmentation and Classification for Urban Village Using a Worldview Satellite Image Based on U-Net.” Remote Sensing 12, no. 10 (2020): 1574. |
G2F | | Pakistani Avatar Generation using GANs | Kim, Junho, Minjae Kim, Hyeonwoo Kang, and Kwanghee Lee. “U-GAT-IT: unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation.” arXiv preprint arXiv:1907.10830 (2019). |
G3F | | Pakistani Avatar Generation using GANs | Tang, Hao, Hong Liu, Dan Xu, Philip HS Torr, and Nicu Sebe. “Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks.” arXiv preprint arXiv:1911.11897 (2019). |
G1S | | Brain Hemorrhage Detection | Chilamkurthy, Sasank, Rohit Ghosh, Swetha Tanamala, Mustafa Biviji, Norbert G. Campeau, Vasantha Kumar Venugopal, Vidur Mahajan, Pooja Rao, and Prashant Warier. “Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study.” The Lancet 392, no. 10162 (2018): 2388-2396. |
G1T | | Food Analysis on Hyperspectral Imagery using Deep Learning | Liu, Shengjie, Haowen Luo, Ying Tu, Zhi He, and Jun Li. “Wide contextual residual network with active learning for remote sensing image classification.” In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 7145-7148. IEEE, 2018. |
G1U | | A novel modelling approach for all-dielectric metasurfaces using deep neural networks | An, Sensong, Clayton Fowler, Bowen Zheng, Mikhail Y. Shalaginov, Hong Tang, Hang Li, Li Zhou et al. “A novel modeling approach for all-dielectric metasurfaces using deep neural networks.” arXiv preprint arXiv:1906.03387 (2019). |
G1V | | Prostate Cancer Grade Assessment | Nagpal, Kunal, Davis Foote, Yun Liu, Po-Hsuan Cameron Chen, Ellery Wulczyn, Fraser Tan, Niels Olson et al. “Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer.” NPJ digital medicine 2, no. 1 (2019): 1-10. |
G1W | | Imitation Learning On Atari Games Using GAIL | Ho, Jonathan, and Stefano Ermon. “Generative adversarial imitation learning.” In Advances in neural information processing systems, pp. 4565-4573. 2016. |
G1X | | Attention based Multiple Instance learning for medical image analysis | Lu, Ming Y., Drew FK Williamson, Tiffany Y. Chen, Richard J. Chen, Matteo Barbieri, and Faisal Mahmood. “Data Efficient and Weakly Supervised Computational Pathology on Whole Slide Images.” arXiv preprint arXiv:2004.09666 (2020). |
Toolkits
Caffe | TensorFlow |
Torch | Keras |
Some Interesting Links
- Linear algebra review / primer by Martial Hebert
- Some of the research groups working with commercial entities
- Machine Learning Group – Geoffrey Hinton
- New York University – Yann LeCun
- Stanford University – Andrew Ng, Fei-Fei Li’s groups
- Microsoft Research
- Google DeepMind – Alex Graves
- Amazon Research