|
Topics |
Notes / Reading Material / Comments |
News |
| 9th Mar 2021 |
Introduction |
45% Assignments
20% Final Project
5% Class participation and Creating Notes
10% Quizzes
10% Midterm Exam
10% Final Exam |
|
| Recommended Resources |
- Text Book
- Deep Learning by Ian Goodfellow Link
- Dive into Deep Learning by Aston Zhang and co. Link
- Recommended Online Books
- Machine Learning, Oxford – Nando de Freitas Link
- Convolutional Neural Networks for Visual Recognition, Stanford (cs231n) Link
- A curated list of courses (Recommended) Link
- Deep Learning for Natural Language Processing, Stanford Link
- Video Lectures
- Essence of Neural Networks – 3Blue1Brown Link
- Convolutional Neural Networks for Visual Recognition, Stanford (cs231n) – Video Lectures Link
- Neural Networks and Deep Learning – deeplearning.ai – Link
|
|
| 11th Mar 2021 |
Un-Supervised and Supervised Learning
Linear Regression |
- Introduction
- Learning: Supervised, unsupervised.
- Learning: discontinuous, continuous
- Models: bias and variance
- Linear Regression
- Error Function in Linear Regression
Assigned Readings:
Recommended Readings:
Refresh:
Concepts of local minima, local maxima, convex functions, concave functions, critical points, chain rule, saddle point |
|
| 16th Mar 2021 |
No Class |
Makeup class will be announced soon |
|
| 18th Mar 2021 |
No Class |
Makeup class(es) will take place on Saturday 27th March, 2021 |
|
| 23rd Mar 2021 |
No Class on account of Pakistan Resolution Day |
Makeup class will be announced soon |
|
| 25th Mar 2021 |
Linear Regression |
- Linear Regression
- Where to use machine learning?
- What basic questions to ask?
- Problem / objective
- Data
- Model that satisfies the objective
- Loss function
- How to optimize the loss function.
- Error Function in Linear Regression
- What are the convex functions?
- Concept of local and global minima & maxima, saddle point
- Optimization
- Differentiation/gradient
- Critical points
- Differentiation in single variable case
- More than one variable
- Hessian
- Positive semi-definite
- How to find maxima?
Assigned Readings:
Recommended Readings:
|
Assignment 1
|
| 27th Mar 2021 |
Optimization, Gradient Descent and Logistic Regression |
|
|
| 30th Mar 2021 |
Logistic Regression, Classification, Loss Functions |
- Classification
- Linear Classification
- Binary Classification using logistic regression
- Logistic Regression
- Squashing function
- Sigmoid
- Cross Entropy loss function
- Classification and its link with probability
- Maximum likelihood Estimation
- Multi-class classification
Take home task
- Compute the gradient of the cross-entropy loss function with softmax.
- Is the cross-entropy loss function with softmax a convex function?
Assigned Readings:
Recommended Reading: Softmax and Cross Entropy
Assigned Readings (Linear Algebra):
Linear Algebra for Machine Learning |
|
| 1st Apr 2021 |
Multi Class Classification, Optimization and Gradient Descent |
- Multi-class classification
- Softmax and cross entropy loss.
- Why is softmax used but not sigmoid?
- Optimization of loss function
- Gradient Descent
- Stochastic Gradient Descent
- Batch Gradient descent
- Minibatch Gradient Descent
- Gradient Descent Optimization Algorithms
- Momentum
- Adagrad
- Adadelta
- RMSprop
- Adam
- AdaMax
- Visualization of Algorithms
- Which optimizer to use?
Recommended Readings:
Gradient Descent: Video Lecture from Coursera, Andrew NG |
|
| 2nd Apr 2021 |
Neural Networks |
- Feed-forward Neural Networks
- Perceptron
- OR-Function
- AND-Function
- XOR-Function
- Multiple Perceptrons
- Multiple Layer Neural Network
- Nonlinear classification (circle)
- Role of activation functions
- Hard-limit, sigmoid, tanh, ReLU, Leaky ReLU, Maxout, ELU
- Input, output and hidden layers
- Why do we need non-linear activation functions?
- Forward pass as matrix multiplication.
- Decision Boundaries
Assigned Readings:
- Chapter 6: Deep Feedforward Networks; Book: Deep Learning by Ian Goodfellow.
Up to and including Section 6.3.
Recommended Readings:
Perceptron Rule
|
|
| 3rd Apr 2021 |
Backpropagation |
- How to determine Weights?
- Chain Rule
- Back-propagation Algorithm
- Training Neural Networks
Assigned Readings:
- Chapter 6: Deep Feedforward Networks; Book: Deep Learning by Ian Goodfellow. Section 6.5 complete.
Recommended Readings
Optional:
How to do backpropagation in a brain by Hinton
Video Lecture:
Lecture 4: Backpropagation; Dhruv Batra
Lecture 10 – Neural Networks, Yaser Abu-Mostafa. |
|
| 6th Apr 2021 |
Neural Network Training |
- How to determine Weights?
- Chain Rule
- Back-propagation Algorithm
- Weights Initialization Techniques
- Activation Functions
- ReLU
- Sigmoid
- Tanh
- Swish
- Leaky ReLU
- ELU
Assigned Readings:
|
Assignment 2 |
| 8th Apr 2021 |
Weight Initialization and Batch Normalization |
- Weight Initialization
- Random Initialization.
- Initialize Weights using Gaussian Distribution.
- If you are using ReLU activation, use the Kaiming (MSRA) method.
- If you are using tanh activation, use the Xavier initialization method.
- Covariate Shift
- During training, updating the weights of earlier layers changes the distribution of each later layer’s inputs. This is called internal covariate shift.
- Batch Normalization.
- It is used to tackle the covariate shift problem.
- Improves gradient flow.
- Makes deeper networks much easier to train.
- Allows higher learning rates and faster convergence.
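The initialization rules and the batch normalization step listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's reference implementation: the function names (`kaiming_std`, `xavier_std`, `batch_norm`) are hypothetical, the Xavier variant shown is the Gaussian form with variance 2 / (fan_in + fan_out), and `gamma` and `beta` are fixed scalars here although they are normally learned per feature.

```python
import math


def kaiming_std(fan_in):
    """Std of a zero-mean Gaussian for Kaiming/MSRA init (suited to ReLU)."""
    return math.sqrt(2.0 / fan_in)


def xavier_std(fan_in, fan_out):
    """Std of a zero-mean Gaussian for Xavier/Glorot init (suited to tanh)."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: scale and shift (assumed scalar for this sketch).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n  # biased variance
        for i in range(n):
            x_hat = (column[i] - mean) / math.sqrt(var + eps)
            out[i][j] = gamma * x_hat + beta
    return out


# Two samples with features on very different scales.
normalized = batch_norm([[1.0, 10.0], [3.0, 30.0]])
# After normalization each feature column has roughly zero mean and unit variance.
```

At inference time, implementations typically replace the per-batch mean and variance with running averages collected during training; that bookkeeping is left out of this sketch.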
Assigned Readings
Recommended Reading:
|
|
| 13th Apr 2021 |
Regularization and Dropout |
- Capacity of Network
- Overfitting
- How to avoid overfitting
- Regularization
- Bias/Variance, overfitting, underfitting
- L2 /L1 regularization
- Drop Out
- Concept of stochastic regularization
- Forward / backward
- Helps the network converge.
- Data Augmentation
- Learning Rate decay
Home Work:
Assigned Readings:
Video Lectures
- Deep Learning – Lecture 4 – Nando de Freitas – Link
Recommended Readings:
|
|
| 15th Apr 2021 |
Texture and Convolution filters |
- Texture/ pattern
- Why is texture important?
- How to represent texture?
- Texture can be useful in image recognition
- Convolution
- Convolution of two signals in continuous time
- Convolution in discrete time
- Convolution in image processing
- Image Filtering
- Applying filter on images using convolution
- Box filter
- Sharpening Filters
|
|
| 20th Apr 2021 |
Filters and Convolutional Neural Networks |
- Derivative Filters
- Double Derivative filters
- Gaussian Filters
- Filter Banks
- Filters to detect edges
- Filters to detect bars
- Filters to detect blobs
- What do CNNs learn?
- Convolution is shift invariant
- Convolutional Neural Networks
- Why do we need convolutional neural networks?
- Building blocks of CNN
- Convolutional Layer
- Pooling Layer
- Activation Function
NOTES
Assigned Readings
Recommended Readings
|
|
| 22nd Apr 2021 |
CNNs |
- Building deep neural networks using convolution and pooling layers.
- Hyperparameters in CNNs
- Number of layers
- Size of features
- Pooling window size
- Stride
- Number of neurons in fully connected layers
- What is a Receptive Field?
- Why it is important for object recognition
- Relationship with filter size, depth
- How does a receptive field affect accuracy?
- Increasing receptive field
- Using dilated convolutions
- Max Pooling
- 1×1 Convolution
- Feature Fusion
- Dimensionality reduction or bottleneck layer.
Assigned Readings
|
|
| 27th Apr 2021 |
Back Propagation in CNNs |
- Backpropagation for Convolutional layer
- Backpropagation for Pooling layer
- Converting a FC layer to the Convolutional Layer
Reference
Video Lecture:
|
Assignment 3 Deliverable 2 |
| 29th Apr 2021 |
Transfer Learning |
- Transfer Learning
- Layers and hierarchical feature learning
- Freezing and Fine-tuning
- LeNet, AlexNet, VGG, Inception Net, ResNet
- Larger Models and Parameters
- Larger models mean more layers, i.e. deeper models.
- More parameters mean we need more data to train; without enough data, the model overfits.
- Larger models have greater learning capacity
- A large number of parameters also increases the chance of overfitting.
- Link with receptive field
- A 1×1 convolution leaves the receptive field unchanged.
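The link between depth, filter size and receptive field can be checked with a small calculation. The sketch below uses the standard recurrence for stacked convolution and pooling layers and assumes no dilation; the function name `receptive_field` and the `(kernel_size, stride)` layer encoding are illustrative choices, not something defined in the course material.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    Assumes ordinary (non-dilated) convolution or pooling layers.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between neighbouring units
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


two_convs = [(3, 1), (3, 1)]
print(receptive_field(two_convs))             # 5
print(receptive_field(two_convs + [(1, 1)]))  # 5: the 1x1 layer changes nothing
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8: pooling widens later layers
```

Because a 1×1 kernel contributes `(1 - 1) * jump = 0`, it can add depth or reduce channels without enlarging the receptive field, while each strided layer makes every later kernel cover more input pixels.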
|
Assignment 3 Deliverable 1 |
| 4th May 2021 |
Transfer Learning and CNN Architectures |
- Padding types
- Vanishing and exploding gradient problem in training of large networks.
- Use of residual blocks to help gradients flow.
- Residual Blocks
- Skip layers
- Using a bottleneck layer to reduce computation
- Different Network Types
- AlexNet
- VGG
- Inception
- ResNet
- DenseNet
- GoogLeNet
- Semantic segmentation
- Groups of pixels of the same class
- It is a classification problem at the pixel level
- Fully convolutional networks
Assigned Reading
Recommended Readings
|
|
| 6th May 2021 |
Semantic Segmentation |
- CNN for Semantic Segmentation
- Deconvolution
- Upscaling-convolution
- Fully Convolutional Network
- Semantic segmentation
- Groups of pixels of the same class
- It is a classification problem at the pixel level
- Object Detection
- Classification
Assigned Reading
Recommended Readings
|
|
| 18th May 2021 |
Localization and Object Detection |
- Localization
- How localization is different from classification
- Localization as regression problem
- Bounding box regression
- Classification head
- Regression head
- Object Detection
- Object detection as classification and localization
- Using regression and localization head for object detection
- Intersection over Union
- Sliding window method
- Region Proposal method
- Non-Maximum Suppression
- Selective search
- Single-stage detectors
- Two-stage detectors
- Interesting Project Topics
- Transformers
- Graph Neural Networks
- Self-supervised learning
- Semi-Supervised Learning
- Zero shot Learning
- Few Shot Learning
Recommended Readings
YOLO, SSD, R-CNN, Mask R-CNN, Transformers paper |
Assignment 4
|
| 20th May 2021 |
Object Detection, Recurrent Neural Networks |
- Sliding Window approach without anchors.
- Sliding Window approach with anchors.
- Two-stage detectors
- Single-stage detectors
- Feature Pyramid Networks
- Hard Negative Mining
- Sequence Modelling
- Many to one, e.g. sentiment classification
- Many to many, e.g. translation
- One to many, e.g. image to caption
- Recurrent Neural Networks
- Output dependent on current input and previous hidden state
- Backpropagation through time
- Vanishing/Exploding gradients
- Gated cells allow previous hidden states to bypass current cell
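The statement that an RNN's output depends on the current input and the previous hidden state can be made concrete with a vanilla (ungated) recurrent step. This is a minimal sketch assuming a tanh activation and the update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b); the names `rnn_step`, `run_rnn`, `W_xh`, `W_hh` and `b` are illustrative and not taken from the lecture.

```python
import math


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One vanilla RNN step: combine current input x with previous state h_prev."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(inputs, h0, W_xh, W_hh, b):
    """Unroll the same step (shared weights) over a sequence of inputs."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states


# One-dimensional input and hidden state, with hand-picked weights.
states = run_rnn(
    inputs=[[1.0], [0.0], [0.0]],
    h0=[0.0],
    W_xh=[[1.0]],
    W_hh=[[0.5]],
    b=[0.0],
)
# The first input still influences later states through the recurrence,
# but its effect shrinks at every step.
```

Backpropagation through time differentiates through this unrolled loop, so the gradient is multiplied by a factor involving `W_hh` at every step, which is where vanishing and exploding gradients come from.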
Assigned Reading
Sequence Modeling: Recurrent and Recursive Nets, Chapter 10 from textbook.
Recommended Readings
Faster R-CNN, Fast R-CNN, RetinaNet |
|
| 1st June 2021 |
Language Models |
- Basic Neural Language Model
- Conditional Language Model
- Word2vec
- One hot encoding
- Distributed vector representations
- Bidirectional RNN
- Transformer Architecture
- Evaluation criteria for caption generation
- Autoencoders
Recommended Readings
|
|
| 3rd June 2021 |
Unsupervised Learning |
- Unsupervised Learning
- Principal Component Analysis
- Clustering
- K-means Clustering
- Cluster Centres
- Cluster association
- Coordinate Descent Algorithm
- Generative Models
- Autoregressive generative models
- Pixel RNN paper
- Autoencoders
- Hugo Larochelle’s lectures are a treat – Slides
Readings
|
|
| 8th June 2021 |
Autoencoders |
- The Autoencoder Principle
- The output image is trained to match the input image
- The input is first encoded, then decoded
- Latent representation
- Different types of autoencoders
- Denoising Autoencoder
- Sparse Autoencoder
- Variational Autoencoder
- Variational Inference
|
|
| 10th June 2021 |
Generative Adversarial Networks |
- Generative Adversarial Networks
- Two player game (Generator vs Discriminator)
- ProGAN
- CycleGAN
- StyleGAN
- DCGAN
- WGAN
|
|
| 15th June 2021 |
Do deep networks Generalize? |
- Do Deep nets Generalize?
- Distribution Shift
- Calibration
- Adversarial Examples
- Linear model hypothesis
- Adversarial Attacks
Assigned Readings:
Assigned Video Content:
Recommended Readings:
Wasserstein GAN Blog Post (From GAN to WGAN) |
|
| 17th June 2021 |
Adversarial Attacks |
- Adversarial Attacks
- Changing loss function to deal with them
- Fast Gradient Sign Method
- Transferability of adversarial attacks
- Zero-Shot black-box attack
- Learning based control & imitation
- From prediction to control challenges
- Reinforcement learning
- Agents
- Concept of Reward function
- States
- Observations
|
Assignment 5 |
| 24th June 2021 |
Reinforcement Learning |
- Deep learning in board games.
- Q-function: expected total reward
- Q-Learning
- Policy Learning
- Finding the policy that maximizes the reward
- Policy gradients
- AlphaZero
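The Q-function and Q-learning items above can be illustrated with the tabular update rule. This is a minimal sketch assuming a small discrete state and action space stored in a dictionary; `q_learning_update`, `greedy_policy`, and the default `alpha` (learning rate) and `gamma` (discount factor) values are illustrative assumptions, and deep Q-learning replaces the table with a neural network.

```python
def q_learning_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Move Q[state][action] toward reward + gamma * max_a' Q[next_state][a']."""
    best_next = max(Q[next_state])
    td_target = reward + gamma * best_next
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


def greedy_policy(Q):
    """For each state, pick the action with the highest estimated total reward."""
    return {
        state: max(range(len(values)), key=lambda a: values[a])
        for state, values in Q.items()
    }


# Two states with two actions each; Q[state][action] is the current estimate.
Q = {0: [0.0, 0.0], 1: [1.0, 2.0]}
q_learning_update(Q, state=0, action=1, reward=1.0, next_state=1)
policy = greedy_policy(Q)
```

Q-learning derives the policy from the learned values, as `greedy_policy` does here, whereas policy-gradient methods adjust the policy's parameters directly to maximize expected reward.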
|
|