Intro To CNN
• Why CNN?
• Deep neural networks
– Increased number of parameters
– Overfitting
• Can we reduce number of parameters?
– CNN
Difference between FFNN and CNN
                               FFNN       CNN
Connections                    Dense      Sparse
No. of learnable parameters    High       Low
Learnable parameters
MNIST dataset
• FFNN
• 1st layer:
– 256 hidden units, input size = 28×28 = 784
– Weights + biases: (784×256) + 256 = 200,960
• 2nd layer:
– 10 output units
– Weights + biases: (256×10) + 10 = 2,570
• Total: 200,960 + 2,570 = 203,530
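The counts above follow one rule per dense layer: (inputs × units) weights plus one bias per unit. A minimal Python sketch of that arithmetic follows. The helper names `dense_params` and `conv_params` are hypothetical, and the convolutional layer used for comparison (32 filters of size 3×3 on a single-channel input) is an illustrative assumption, not a figure from these slides.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def conv_params(kernel_size, in_channels, n_filters):
    """Weights plus one bias per filter for a convolutional layer."""
    return kernel_size * kernel_size * in_channels * n_filters + n_filters


layer1 = dense_params(28 * 28, 256)  # 784 inputs -> 256 hidden units
layer2 = dense_params(256, 10)       # 256 hidden units -> 10 outputs
total = layer1 + layer2

# Assumed comparison: 32 filters of 3x3 on a 1-channel image (illustrative only).
conv_example = conv_params(3, 1, 32)

print(layer1, layer2, total)  # 200960 2570 203530
print(conv_example)           # 320
```

The convolutional count does not depend on the image size, because the same small filter is reused at every position.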
CNN: Convolutional layer
(Figure source: https://en.wikipedia.org/wiki/Convolution)
• Filter or kernel
– Extract features from images
– Slides over the input from left to right, top to bottom,
computing the dot product with each sub-region of the input
• Activation map or feature map
– Output volume formed by sliding the filter over the
image
• Depth
– Number of filters
• Stride
– Number of pixels the filter shifts at each step; the main
objective is to produce smaller output volumes
– If s=2, the filter shifts by 2 pixels at a time as it
convolves over the input volume
– The output size shrinks as the stride value increases
• Padding
– Adds a border of zeros around the input image
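The terms above (filter, stride, padding) can be tied together in a short sketch. This is a plain-Python illustration under stated assumptions, not a library implementation: the function name `conv2d`, the square kernel, and the all-ones 6×6 input and 3×3 filter are all hypothetical choices made only to show how stride and padding change the output size.

```python
def conv2d(image, kernel, stride=1, pad=0):
    """Slide `kernel` over `image` (lists of lists), left to right, top to bottom."""
    # Zero padding: surround the image with `pad` rows/columns of zeros.
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * width for _ in range(pad)]

    k = len(kernel)  # assumes a square k x k kernel
    out = []
    for top in range(0, len(padded) - k + 1, stride):
        out_row = []
        for left in range(0, width - k + 1, stride):
            # Dot product of the kernel with the sub-region under it.
            out_row.append(sum(
                kernel[i][j] * padded[top + i][left + j]
                for i in range(k) for j in range(k)
            ))
        out.append(out_row)
    return out


def shape(matrix):
    return (len(matrix), len(matrix[0]))


image = [[1] * 6 for _ in range(6)]   # hypothetical 6x6 input of ones
kernel = [[1] * 3 for _ in range(3)]  # hypothetical 3x3 filter of ones

print(shape(conv2d(image, kernel)))            # (4, 4): stride 1, no padding
print(shape(conv2d(image, kernel, stride=2)))  # (2, 2): larger stride, smaller output
print(shape(conv2d(image, kernel, pad=1)))     # (6, 6): padding keeps the input size
```

Each call produces one activation map; stacking the maps from several filters gives the depth of the output volume.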
CNN
• Input layer
• Convolutional layer
• Fully connected layer
• Output layer
Convolutional layer
Consider the following 6×6 image and 3×3 filter:

INPUT (6×6)         KERNEL (3×3)
3 0 1 2 7 4         1  0 -1
1 5 8 9 3 1         1  0 -1
2 7 2 5 1 3         1  0 -1
0 1 3 1 7 8
4 2 1 6 2 8
2 4 5 2 3 9
Place the filter on the top-left 3×3 region of the image and multiply element-wise:

3 0 1       1  0 -1       3  0 -1
1 5 8   *   1  0 -1   =   1  0 -8
2 7 2       1  0 -1       2  0 -2

=> 3+0+(-1)+1+0+(-8)+2+0+(-2) = -5
Shift the filter one column to the right and repeat:

0 1 2       1  0 -1       0  0 -2
5 8 9   *   1  0 -1   =   5  0 -9
7 2 5       1  0 -1       7  0 -5

=> 0+0+(-2)+5+0+(-9)+7+0+(-5) = -4
Similarly, applying the filter to the complete image yields the following 4×4 matrix:

 -5  -4   0   8
-10  -2   2   3
  0  -2  -4  -7
 -3  -2  -3 -16
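The worked example can be checked with a few lines of plain Python. This is a sketch, with `convolve_valid` as a hypothetical helper name. It uses stride 1 and no padding, and it applies the filter without flipping it, as in the hand computation above.

```python
image = [
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]


def convolve_valid(image, kernel):
    """Apply a square kernel at every position where it fits (stride 1, no padding)."""
    k = len(kernel)
    n_rows = len(image) - k + 1
    n_cols = len(image[0]) - k + 1
    return [
        [
            # Sum of element-wise products between the kernel and one 3x3 patch.
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


result = convolve_valid(image, kernel)
for row in result:
    print(row)
# [-5, -4, 0, 8]
# [-10, -2, 2, 3]
# [0, -2, -4, -7]
# [-3, -2, -3, -16]
```

The first two entries of the top row, -5 and -4, match the two steps computed by hand.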
Computing output Volume [W2×H2×D2]