
Introduction to Deep Learning

Nandita Bhaskhar
Content adapted from CS231n and past CS229 teams
April 29th, 2022
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools

2
Classical Approaches Saturate!
● Computer vision is especially hard for conventional image processing techniques.
● Humans are just intrinsically better at perceiving the world!

https://xkcd.com/1425/
3
What about the MLPs we learnt in class?

Recall:
● Input Layer
● Hidden layer
● Activations
● Outputs

Pic Credit: Becoming Human: Artificial Intelligence Magazine 4


What about the MLPs we learnt in class?
● Expensive to learn, and will not generalize well.
● Does not exploit the order and local relations in the data!
● A 64x64x3 image flattened into a vector gives 12,288 inputs, so every neuron in the first hidden layer needs 12,288 weights.
● We also want many layers, which multiplies the cost (see the sketch below).
5
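For concreteness, here is the parameter arithmetic as a quick sketch; the 256-unit hidden layer and the 32-filter conv layer are assumed sizes for illustration, not from the slides:

# A 64x64 RGB image flattened into a vector has 64*64*3 = 12,288 inputs.
inputs = 64 * 64 * 3                      # 12288
hidden = 256                              # assumed hidden-layer width
mlp_params = inputs * hidden + hidden     # weights + biases of the first layer
print(mlp_params)                         # 3,145,984 parameters in one layer alone

# By contrast, a conv layer with 32 filters of size 3x3x3 needs only:
conv_params = 32 * (3 * 3 * 3) + 32       # 896 parameters
print(conv_params)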
Overview
● Motivation for deep learning
● Areas in Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools

6
What are the different pillars of deep learning?

● Convolutional NN: images
● Recurrent NN: time series
● Graph NN: networks / relational data
● Deep RL: control systems

7
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools

8
Convolutional Neural Networks

[Pillar diagram with Convolutional Neural Network highlighted; Recurrent NN, Deep RL, and Graph NN come later.]

9
Let us look at images in detail

10
2D Convolution

11
Pic Credit: Apple, Chip Huyen
Convolving Filters
Sharpening
https://ai.stanford.edu/~syyeung/cvweb/tutorials.html

Edge Detection: Laplacian Filters

4-neighbour:        8-neighbour:
 0 -1  0            -1 -1 -1
-1  4 -1            -1  8 -1
 0 -1  0            -1 -1 -1

12
https://ai.stanford.edu/~syyeung/cvweb/tutorials.html
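As a sketch of how such a filter is applied, here is a from-scratch NumPy implementation (my own illustration, not code from the slides):

import numpy as np

# The 4-neighbour Laplacian kernel from the slide
K = np.array([[ 0, -1,  0],
              [-1,  4, -1],
              [ 0, -1,  0]], dtype=float)

def conv2d_valid(img, k):
    """Naive 'valid' 2D sliding-window filter (deep learning libraries
    actually compute cross-correlation and still call it convolution)."""
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

img = np.random.rand(8, 8)        # toy grayscale image
edges = conv2d_valid(img, K)      # large responses where intensity changes sharply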
Convolving Filters
● Why not extract features using filters?
● Better yet, why not let the data dictate what filters to use?
● Learnable filters!!
13
Convolution on multiple channels
● Images are generally RGB!
● How would a filter work on an image with RGB channels?
● The filter should also have 3 channels.
● Now the output has a channel for every filter we have used.
14
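A quick shape-check of the multi-channel case, sketched here in PyTorch (the framework choice and the 8-filter layer are illustrative assumptions):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)    # (batch, RGB channels, height, width)
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
y = conv(x)
print(y.shape)                   # torch.Size([1, 8, 64, 64])
# Each of the 8 filters spans all 3 input channels, and the output has
# one channel per filter, exactly as described above.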
[Slides 15-29: step-by-step visual walkthrough of the convolution operation (sliding the filter over the input, stride, padding). Slide Credit: CS231n]
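Such walkthroughs build up to the standard output-size arithmetic for a convolution. The formula itself is standard; the slide contents above are images, so this sketch is my reconstruction:

def conv_output_size(w, f, p, s):
    """Spatial output size for input width w, filter size f, padding p, stride s."""
    return (w - f + 2 * p) // s + 1

conv_output_size(7, 3, 0, 1)   # 5  (7x7 input, 3x3 filter, stride 1)
conv_output_size(7, 3, 0, 2)   # 3  (stride 2 halves the output, roughly)
conv_output_size(7, 3, 1, 1)   # 7  ("same" padding preserves the size)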
Parameter Sharing

The fewer the parameters, the less computationally intensive the training. This is a win-win, since we reuse the same filter weights at every spatial position.
30
Translational invariance
Since we are training filters to detect cats and then moving these filters over the data, a differently positioned cat will also get detected by the same set of filters.
31
Filters? Layers of filters?

Left: images that maximize filter outputs at certain layers. We observe that the images get more complex for filters situated deeper in the network.
Right: deeper layers can learn richer embeddings. An eye is made up of multiple curves, and a face is made up of two eyes.
32
How do we use convolutions?

Let convolutions extract features!


33
Image credit: LeCun et al. (1998)
Fun Fact: Convolution really is just a linear operation
● In fact, convolution is a giant matrix multiplication.
● We can expand the 2-dimensional image into a vector and the convolution operation into a matrix.
34
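A small NumPy sketch verifying this claim via the classic im2col trick (my own illustration, not from the slides):

import numpy as np

def im2col(img, kh, kw):
    """Stack every kh x kw patch of img as a row of a matrix."""
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    rows = [img[i:i+kh, j:j+kw].ravel() for i in range(oh) for j in range(ow)]
    return np.array(rows), (oh, ow)

img = np.random.rand(6, 6)
k = np.random.rand(3, 3)

patches, (oh, ow) = im2col(img, 3, 3)              # shape (16, 9)
as_matmul = (patches @ k.ravel()).reshape(oh, ow)  # conv as one matrix product

# Compare against the naive sliding-window computation
naive = np.array([[np.sum(img[i:i+3, j:j+3] * k) for j in range(ow)]
                  for i in range(oh)])
assert np.allclose(as_matmul, naive)               # identical results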
How do we learn?

We now have a network with:
● a bunch of weights
● a loss function

To learn:
● Just do gradient descent and backpropagate the error derivatives
35
How do we learn?
Instead of plain gradient descent, there are "optimizers":

● Momentum: gradient + momentum
● Nesterov: look-ahead momentum + gradient
● Adagrad: normalize by the sum of squared gradients
● RMSprop: normalize by a moving average of squared gradients
● Adam: RMSprop + momentum
36
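In code, swapping optimizers is a one-line change. A PyTorch sketch (learning rates and the toy loss are illustrative assumptions):

import torch

w = torch.randn(10, requires_grad=True)

# Any of these can be swapped in; only the update rule changes.
opt = torch.optim.SGD([w], lr=0.01, momentum=0.9)                   # momentum
# opt = torch.optim.SGD([w], lr=0.01, momentum=0.9, nesterov=True) # Nesterov
# opt = torch.optim.Adagrad([w], lr=0.01)
# opt = torch.optim.RMSprop([w], lr=0.01)
# opt = torch.optim.Adam([w], lr=0.001)

loss = (w ** 2).sum()    # toy loss
opt.zero_grad()
loss.backward()          # backpropagate the error derivatives
opt.step()               # apply the optimizer's update rule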
Mini-batch Gradient Descent
Computing the gradient over a large dataset is expensive:

● Memory size
● Compute time

Mini-batch: take a sample of the training data per update.

How do we sample intelligently?
37
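A minimal mini-batch loop, sketched in PyTorch; the model, the random data, and the batch size of 64 are assumed for illustration:

import torch
from torch.utils.data import TensorDataset, DataLoader

X, y = torch.randn(10000, 20), torch.randn(10000, 1)   # toy dataset
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = torch.nn.Linear(20, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for xb, yb in loader:                       # one pass over loader = one epoch
    loss = ((model(xb) - yb) ** 2).mean()   # gradient estimated from 64 samples
    opt.zero_grad()
    loss.backward()
    opt.step()

Random shuffling (shuffle=True) is the simplest answer to "sampling intelligently": it decorrelates consecutive batches.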
Is deeper better?
Deeper networks seem to be more powerful but harder to train:

● Loss of information during forward propagation
● Loss of gradient information during backpropagation

There are many ways to "keep the gradient going".
38
Solution
Connect the layers: create a gradient highway or information highway.

ResNet (2015)
39
Image credit: He et al. (2015)
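A simplified residual block in PyTorch; this is a sketch of the skip-connection idea, not the exact He et al. block (the real ones also include batch normalization):

import torch.nn as nn

class ResidualBlock(nn.Module):
    """Simplified residual block: output = F(x) + x (the 'gradient highway')."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)   # gradients flow unimpeded through '+ x'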
Initialization
● Can we initialize all neurons to zero? No: if all the weights are the same, we will not be able to break the symmetry of the network, and all filters will end up learning the same thing.
● Large initial values might knock ReLU units out: once a ReLU's output is zero, its gradient flow also becomes zero.
● We need small random numbers at initialization: mean 0, standard deviation on the order of 1/sqrt(n).

Popular initialization setups

(Xavier, He) x (Uniform, Normal) 40
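In PyTorch these setups are one call each (a sketch; the layer sizes are illustrative):

import torch.nn as nn

layer = nn.Linear(512, 256)

# He (Kaiming) normal init: std = sqrt(2 / fan_in), designed for ReLU networks
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')

# Xavier (Glorot) uniform is the other common choice:
# nn.init.xavier_uniform_(layer.weight)

nn.init.zeros_(layer.bias)   # biases can safely start at zero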


Dropout
● What does cutting off some network connections do?
● It trains multiple smaller networks in an ensemble.
● Can drop an entire layer too!
● Acts like a really good regularizer.
41
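A minimal dropout demonstration in PyTorch (p=0.5 is an illustrative choice):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each unit is zeroed with probability 0.5
x = torch.ones(1, 8)

drop.train()
print(drop(x))   # roughly half the units zeroed; survivors scaled by 1/(1-p)

drop.eval()
print(drop(x))   # at test time dropout is a no-op: the full "ensemble" is used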
Tricks for training
● Data augmentation if your dataset is small. This helps the network generalize better.
● Early stopping when validation loss stops improving and starts rising above the (still-decreasing) training loss.
● Random hyperparameter search or grid search?
42
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools

43
CNN sounds like fun!
What are some other deep learning pillars?

[Pillar diagram with Recurrent NN (time series) highlighted; Convolutional NN, Deep RL, and Graph NN alongside.]


44
We can also have 1D architectures (remember this)
● CNNs work on any data where there is a local pattern.
● We use 1D convolutions on DNA sequences, text sequences, and music notes.
● But what if a time series has a causal dependency, or any kind of sequential dependency?
45
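A 1D convolution sketch in PyTorch; the 4-channel one-hot DNA encoding is an assumed example:

import torch
import torch.nn as nn

# A sequence of length 100 with 4 channels (e.g., one-hot DNA bases A, C, G, T)
x = torch.randn(1, 4, 100)
conv = nn.Conv1d(in_channels=4, out_channels=16, kernel_size=5, padding=2)
print(conv(x).shape)   # torch.Size([1, 16, 100]): local motifs detected per position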
To address sequential dependency?
Use a recurrent neural network (RNN).

[Diagram: an RNN cell consumes one time step at a time, feeding its previous (latent) output back into itself; "unrolling" the RNN lays the time steps out in sequence.]

The unrolled copies are really the same cell, NOT many different cells like the kernels of a CNN.
46
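A minimal NumPy sketch of one recurrent step (sizes are illustrative); note that the same W_h and W_x are reused at every time step:

import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    """One time step: the SAME weights are reused at every position."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

hidden, inp = 8, 5
W_h = np.random.randn(hidden, hidden)
W_x = np.random.randn(hidden, inp)
b = np.zeros(hidden)

h = np.zeros(hidden)
for x_t in np.random.randn(10, inp):    # unroll over a 10-step sequence
    h = rnn_step(h, x_t, W_h, W_x, b)   # h is the evolving "embedding"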
How does an RNN produce a result?
The hidden state is an evolving "embedding": the network reads the sentence ("I love CS !") one token at a time, and the result is taken after reading the full sentence.
47
There are 2 types of RNN cells
● Long Short-Term Memory (LSTM): gates decide what to store in "long-term memory" (the cell state) versus the response to the current input.
● Gated Recurrent Unit (GRU): a reset gate and an update gate mix the previous state with the response to the current input.
48
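Both cells are available off the shelf; a PyTorch sketch (the sizes are illustrative):

import torch
import torch.nn as nn

x = torch.randn(1, 12, 32)   # (batch, time steps, features)
lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
gru = nn.GRU(input_size=32, hidden_size=64, batch_first=True)

out, (h_n, c_n) = lstm(x)   # LSTM carries a separate cell state ("long-term memory")
out, h_n = gru(x)           # GRU folds everything into one hidden state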
Recurrent AND deep?
● Stacking: feed one RNN layer's outputs into the next, taking the last value at the top.
● Attention model: instead of taking only the last value, pay "attention" to everything (every time step).
49


"Recurrent" AND convolutional?

Temporal convolutional network (TCN)

● Temporal dependency is achieved through "one-sided" (causal) convolution.
● More efficient, because deep learning packages are optimized for matrix multiplication, and convolution is a matrix multiplication.
● No hard sequential dependency between time steps, so they can be processed in parallel.
50
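A sketch of a causal ("one-sided") 1D convolution in PyTorch; this is my own minimal illustration, not a full TCN:

import torch
import torch.nn as nn

# Causal convolution: pad only on the left, so the output at time t
# sees inputs up to time t and nothing from the future.
class CausalConv1d(nn.Module):
    def __init__(self, ch_in, ch_out, k):
        super().__init__()
        self.pad = k - 1
        self.conv = nn.Conv1d(ch_in, ch_out, k)

    def forward(self, x):
        x = nn.functional.pad(x, (self.pad, 0))   # left-pad the time axis
        return self.conv(x)

y = CausalConv1d(8, 8, k=3)(torch.randn(1, 8, 50))
print(y.shape)   # torch.Size([1, 8, 50]): sequence length is preserved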
More? Take CS230, CS236, CS231N, CS224N

● Convolutional NN: images
● Recurrent NN: time series
● Graph NN: networks / relational data
● Deep RL: control systems
51
Not today, but take CS234 and CS224W

● Convolutional NN: images
● Recurrent NN: time series
● Graph NN: networks / relational data
● Deep RL: control systems

52
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools

53
Tools for deep learning

[Logo slide: popular tools, plus specialized groups of frameworks]
54
Where can I get free stuff?

Google Colab:
● Free (limited-ish) GPU access
● Works nicely with TensorFlow
● Links to Google Drive

Other options: Azure Notebooks, Kaggle kernels???, Amazon SageMaker?

Free credits:
● Register a new Google Cloud account => instant $300??
● AWS free tier (limited compute)
● Azure education account, $200?

To SAVE money: CLOSE your GPU instance (~$1 an hour)
55
Good luck!
Well, have fun too :D

56
