Deep Neural Network AIML Handout v1.0-1


BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI

WORK INTEGRATED LEARNING PROGRAMMES


Digital
Part A: Content Design
Course Title Deep Neural Network

Course No(s)

Credit Units 4

Content Authors Dr. Bharatesh Chakravarthi
Version 1.0

Date September 24th 2023

Course description
DSECCZG524 Deep Neural Networks
Introduction to neural networks, approximation properties, back propagation, deep
network training, regularization and optimization, convolutional neural networks,
recurrent neural networks, attention models, transformers, neural architecture search,
federated learning, meta learning, applications in time series modelling and forecasting,
online (incremental) learning.

Pre-requisites:
Mathematical Foundations for Machine Learning
Introduction to Statistical Methods
Machine Learning

Course Objectives
No Course Objective

CO1 Introduce students to the basic concepts and techniques of Deep Learning.

CO2 Students will be able to apply deep learning models to applications.

CO3 Students will be able to evaluate deep learning algorithms.


Text Book(s)
T1 Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola.
https://d2l.ai/chapter_introduction/index.html

Reference Book(s) & other resources


R1 Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville. MIT Press 2016.

R2 Introduction to Deep Learning by Eugene Charniak. The MIT Press 2019

R3 Deep Learning with Python by Francois Chollet. 1st Edition. Manning Publications
https://livebook.manning.com/book/deep-learning-with-python/part-1/

R4 Deep Learning for Time Series Forecasting by Jason Brownlee

R5 Neural Architecture Search: A Survey by Thomas Elsken, Jan Hendrik Metzen, Frank
Hutter https://arxiv.org/pdf/1808.05377.pdf

Content Structure

1 Fundamentals of Neural Network (4 hrs)


1.1 Objective of the course
1.2 Supervised, unsupervised, semi-supervised and reinforcement learning problems
1.3 Why Deep Learning?
1.4 Applications of Deep Learning
1.5 Perceptron and Perceptron learning algorithm
1.6 Multilayer Perceptron (MLP)
1.7 MLP as classifiers and Universal approximators
1.8 Issue of Depth and Width

2 Deep Feedforward Neural Networks (4 hrs)


2.1 Forward and backward propagation
2.2 Computation graph
2.3 Gradient Descent algorithm
2.4 Impact of depth in DNN

3 Optimization of Deep models (2 hrs)


3.1 Challenges in Neural Network Optimization – saddle points and plateaus
3.2 Non-convex optimization intuition
3.3 Overview of optimization algorithms
3.4 Momentum-based algorithms
3.5 Algorithms with Adaptive Learning Rates

4 Regularization for Deep models (2 hrs)


4.1 Model Selection
4.2 Underfitting and Overfitting
4.3 L1 and L2 Regularization
4.4 Dropout
4.5 Challenges - Vanishing and Exploding Gradients, Covariate shift
4.6 Parameter Initialization
4.7 Batch Normalization

5 Convolutional Networks (4 hrs)


5.1 Convolutions for Images
5.2 Learning a Kernel
5.3 Padding and stride, Channels, Pooling
5.4 Design of CNN
5.5 Popular CNN architectures
5.6 Transfer Learning
5.7 Applications of CNN

6 Sequence Models (6 hrs)


6.1 Recurrent Neural Networks
6.2 Back-propagation through time
6.3 Challenge - Exploding and Vanishing gradients, and Gates
6.4 Popular RNN architectures
6.5 Applications of RNNs

7 Attention Mechanism (4 hrs)


7.1 Attention Pooling
7.2 Attention Scoring Functions
7.3 Multi-Head Attention, Self-Attention and Positional Encoding
7.4 Transformer architecture
7.5 Applications of Transformers

8 Neural Architecture Search (2 hrs)


8.1 Search Space
8.2 Search algorithms
8.3 Evaluation Strategy

9 Time series Modelling and Forecasting (2 hrs)


9.1 Univariate, Multivariate and Multi-step CNN Models
9.2 Univariate, Multivariate and Multi-step LSTM Models

10 Other Learning Techniques (4 hrs)


10.1 Federated learning
10.2 Meta learning
10.3 Online (incremental) learning
Learning Outcomes:

No Learning Outcomes

LO1 Able to understand the basics of Deep Learning.

LO2 Able to understand and apply techniques related to Deep Learning to applications.

LO3 Able to identify appropriate tools to implement the solutions to problems related
to Deep Learning and implement solutions.

Part B: Learning Plan

Session No. Topic Title Resource Reference

1 Fundamentals of Neural Network

• Objective of the course
• Why Deep Learning?
• Applications of Deep Learning
• Biological neuron vs artificial neuron
• Connectionism model
• Perceptron
• Perceptron learning algorithm
• XOR Problem
Resource Reference: T1 – Ch1; http://mlsp.cs.cmu.edu/people/rsingh/docs/Chapter1_Introduction.pdf

2 Fundamentals of Neural Network

• Multilayer Perceptron (MLP)
• MLP on Boolean, reals and continuous values
• MLP as classifiers
• MLP as Universal approximators
• Issue of Depth and Width
Resource Reference: http://mlsp.cs.cmu.edu/people/rsingh/docs/Chapter1_Introduction.pdf; http://mlsp.cs.cmu.edu/people/rsingh/docs/Chapter2_UniversalApproximators.pdf

3 Deep Feedforward Neural Network

• Forward Propagation
• MLP with hidden Layers
• Backward Propagation (review)
• Training a DNN using Gradient Descent algorithm (review)
• Computational Graphs (review as already discussed in MFML)
Resource Reference: T1 – Ch4 and Ch3.4

4 Deep Feedforward Neural Network

• Activation Functions
• Softmax Regression
Resource Reference: T1 – Ch4 and Ch3.4

5 Optimization algorithms for Deep models

• Challenges – Saddle points and plateaus
• Non-convex optimization intuition
• Stochastic Gradient Descent (SGD), Minibatch SGD
• Overview of Rprop, Quickprop
• Momentum, Nesterov's Accelerated Momentum
• Algorithms with Adaptive Learning Rates: Adagrad, RMSprop, Adam
Resource Reference: T1 – Ch11

6 Regularization for Deep models

• Model Selection, Underfitting, and Overfitting
• L1 and L2 Regularization
• Dropout
• Challenge - Vanishing and Exploding Gradients
• Parameter Initialization
• Challenge - Covariate Shift
• Batch Normalization
Resource Reference: T1 – Ch4, 7.5

7 Convolutional Neural Network

• Basics of Computer Vision and Invariance
• Convolutions for Images
• Learning a Kernel
• Padding and stride
• Channels
• Pooling
• Designing a CNN
Resource Reference: T1 – Ch6

8 Popular CNN architectures

• LeNet
• AlexNet
• VGG16
• RCNN and Fast RCNN
• Network in Network (NiN)
• Inception Net
• ResNet
• DenseNet
• Transfer Learning
• Applications of CNN
Resource Reference: T1 – Ch7

9 Sequence Models

• Recurrent Neural Networks
• Types of Sequences and RNNs
• Back-propagation Through Time (discuss the paper)
• Gates and Exploding / Vanishing gradient
Resource Reference: T1 – Ch8

10 Popular RNN architectures

• Gated Recurrent Units (GRU)
• Long Short-Term Memory (LSTM) Networks (discuss why and how LSTM solves)
• Bidirectional models
Resource Reference: T1 – Ch9

11 Attention Mechanism

• Attention Pooling
• Attention Scoring Functions
• Multi-Head Attention
Resource Reference: T1 – Ch10

12 Attention Mechanism

• Self-Attention
• Positional Encoding
• Transformer architecture
• Applications of Transformers
Resource Reference: T1 – Ch10

13 Neural Architecture Search overview

• Search Space
• Search algorithms
• Evaluation Strategy
Resource Reference: R5

14 Time series Modelling and Forecasting

• Using CNN
• Using LSTM
(Trends and Seasonality will be discussed. No overview of statistical techniques)
Resource Reference: R4, Ch-8, Ch-9

15 Federated learning

• Federated Learning Of Out-Of-Vocabulary Words
Meta learning
Resource Reference: https://arxiv.org/pdf/1902.04885.pdf; https://arxiv.org/pdf/1903.10635.pdf; https://arxiv.org/pdf/2004.05439.pdf

16 Online (incremental) learning

• Continual Lifelong Learning with Neural Networks
• Three scenarios for continual learning
Resource Reference: https://arxiv.org/pdf/1802.07569.pdf; https://arxiv.org/pdf/1904.07734v1.pdf

Detailed Plan for Lab work

Lab No. Lab Objective Lab Sheet Access URL Session Reference

1 Introduction to TensorFlow, Keras 2

2 Computational graph in PyTorch 3

3 Deep Neural Network with Back-propagation and optimization 4

4 CNN 6

5 RNN 9

6 LSTM 10

7 Transformers 12

8 Time series forecasting 15
