
Convolutional Neural Network

By Gagandeep Kaur
From Cats to CNN
 Around 1959, Hubel and Wiesel performed a famous experiment on a cat.
 The cat sat in front of a screen on which lines were displayed at
different locations and in different orientations: slanted, horizontal,
vertical, and so on.
 Electrodes fitted to the cat measured which parts of the brain
actually respond to different visual stimuli.
 The outcome of the study was that different neurons in the brain fire
only for particular types of stimuli; it is not the case that all neurons in
the brain fire for every kind of visual stimulus.
From Cats to CNN
 This is essentially the idea behind convolutional neural
networks, starting from the Neocognitron, which was proposed back in
1980.
 The Neocognitron, proposed by Fukushima (1980), is a
hierarchical multilayered neural network capable of
robust visual pattern recognition through learning.
 An example of the response of a Neocognitron that
has been trained to recognize handwritten digits: the
input pattern is correctly recognized as '5'.
From Cats to CNN
 The modern convolutional neural network was proposed by
Yann LeCun in 1989.
 He was interested in using it for the task of handwritten
digit recognition, applying backpropagation over a Convolutional
Neural Network (CNN), again in the context of
postal delivery services.
 Many PIN codes and phone numbers get written on
postcards, and there was a requirement to read them
automatically, so that the mail could be sorted into different
categories according to the postal code.
From Cats to CNN

 In 1998 came the now famous MNIST data set, which is used for
teaching deep neural network courses and is still popular.
 LeNet-5 CNN Architecture


Revival of CNN (renamed Deep Learning, 2010)

2012-2016
Biological Neurons
MP Neuron (McCulloch and Pitts Neuron)
Convolution In Image Processing

This process is known as Padding

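For example, assuming zero padding: surrounding a 5×5 image with a one-pixel border of zeros gives a 7×7 image, so a 3×3 convolution again produces a 5×5 output, the same size as the original image.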

Convolution In Image Processing
1D Convolution
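As a minimal sketch (not from the slide; the names signal, kernel, output, n, and r are illustrative), a 1D convolution slides a kernel of length 2r+1 along a signal of length n, here with zero padding so the output keeps the input's length:

// 1D convolution with zero padding: output has the same length as the input.
for (int i = 0; i < n; i++) {
    float temp = 0.0f;
    for (int k = -r; k <= r; k++) {
        if (i + k < 0 || i + k >= n)
            continue;                    // zero padding at the borders
        temp += signal[i + k] * kernel[k + r];
    }
    output[i] = temp;
}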
2D Convolution
7×7 input, 3×3 kernel, 5×5 output
2D Convolution

No padding and no strides


2D Convolution
No padding and strides
2D Convolution
Padding and Strides
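The captions above all follow one output-size formula. As a hedged sketch (the function name convOutputSize is illustrative), for an n×n input, k×k kernel, padding p, and stride s:

// Output size of a convolution: floor((n + 2p - k) / s) + 1.
int convOutputSize(int n, int k, int p, int s) {
    return (n + 2 * p - k) / s + 1;   // integer division floors for non-negative values
}

// For example, convOutputSize(7, 3, 0, 1) == 5, matching the 7x7 -> 5x5 case above;
// convOutputSize(5, 3, 1, 2) == 3 shows padding and strides combined.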
3D Convolution

 2D convolution refers *not* to the dimension of the convolution kernel but
to the dimension of the output. The output's dimension is 2D, single
channel. A single 2D convolution pass over a 3D image uses a 3D
convolution kernel to obtain the 2D output. The convolution works like this:

for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        float temp = 0.0f;
        for (int ry = -kernelRadius; ry <= kernelRadius; ry++) {
            for (int rx = -kernelRadius; rx <= kernelRadius; rx++) {
                for (int z = 0; z < numberOfChannels; z++) {
                    int iy = y + ry, ix = x + rx;
                    if (iy < 0 || iy >= height || ix < 0 || ix >= width)
                        continue;   // zero padding at the borders
                    // Kernel indices are shifted by kernelRadius so they start at 0.
                    temp += image[iy][ix][z]
                          * convolutionKernelWeights[ry + kernelRadius][rx + kernelRadius][z];
                }
            }
        }
        output[y][x] = temp;
        // If a bias and activation function are used:
        output[y][x] = activationFunction(output[y][x] + bias);
    }
}
3D Convolution
Relation between Convolution and CNN
Image Classification: Feature Engineering
Sparse pixels: remove unwanted pixels

This calculates the gradient of pixel change
For blurring
In classical machine learning, the machine learns the weights of the classifier, but
the kernels used for feature engineering are hand-crafted (a sketch follows below).
In deep learning we learn the weights of these kernels as well, so the
feature engineering itself is learned.
Can we learn multiple kernels?
Yes, we can learn multiple kernels of variable size.
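As an illustration of a hand-crafted kernel (a minimal sketch; the 3×3 Sobel operator below is a standard edge-detection kernel and not necessarily the one shown on the slide):

// Sobel kernel for horizontal gradients: a classic hand-crafted feature detector.
const float sobelX[3][3] = {
    { -1.0f, 0.0f, 1.0f },
    { -2.0f, 0.0f, 2.0f },
    { -1.0f, 0.0f, 1.0f }
};

// Response at pixel (y, x) of a single-channel image (border handling omitted):
float temp = 0.0f;
for (int ry = -1; ry <= 1; ry++)
    for (int rx = -1; rx <= 1; rx++)
        temp += image[y + ry][x + rx] * sobelX[ry + 1][rx + 1];
// In deep learning, these nine weights would be learned rather than fixed by hand.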
MNIST Digit Data set
CNN
This will lead to underfitting if we use the
same kernel for the whole image
Different kernels to detect different
features in the same image
Example
CNN Terminology
 Convolution
 Filter/Kernel
 Stride
 Padding
 Feature Map
 Volume
 Parameter Sharing
 Local Connectivity
 Pooling (Subsampling/ Down sampling)
 Max pooling/Average Pooling
 Subsampling ratio
 Activation Function
 ReLU
 Softmax
Pooling (Subsampling/ Down sampling)

 Its function is to progressively reduce the spatial size of the representation,
to reduce the number of parameters and the amount of computation in the network.
 The pooling layer operates on each feature map independently.
 Pooling is of two types:
 Average Pooling: calculate the average value for each patch of the feature map.
 Maximum Pooling (or Max Pooling): calculate the maximum value for each
patch of the feature map.


Max Pooling: to reduce the size
Pooling: enhancing the power of convolutions


 First of all, the image size is reduced to half along each side: by taking groups of 2×2
pixels and only retaining the maximum, the image becomes smaller.
 The edges we kept when applying the convolution with the edge filter are not only
maintained, but also intensified.
 This means we have been able to reduce the information the image contains (by
keeping only a quarter of the pixels), while still keeping and intensifying the useful
features the filters reveal when convolving the image.
No Parameters to learn
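As a minimal sketch (the array names featureMap and pooled are illustrative), 2×2 max pooling with stride 2 halves each spatial dimension and, as noted above, has no parameters to learn:

// 2x2 max pooling with stride 2: each output pixel is the maximum of one 2x2 patch.
for (int y = 0; y < height / 2; y++) {
    for (int x = 0; x < width / 2; x++) {
        float m = featureMap[2 * y][2 * x];
        for (int ry = 0; ry < 2; ry++)
            for (int rx = 0; rx < 2; rx++)
                if (featureMap[2 * y + ry][2 * x + rx] > m)
                    m = featureMap[2 * y + ry][2 * x + rx];
        pooled[y][x] = m;   // average pooling would instead sum the patch and divide by 4
    }
}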
CNN
CNNs operate over volumes!

➢ The input is an RGB image.

➢ Unlike regular neural networks, where the input is a
vector, here the input is a multi-channeled image (3
channels in this case).
Convolution Operation with a filter
Output of Convolution Operation with a filter
CNN Layers: Convolutional Layer
 The convolution layer is the main building block of a convolutional neural network.
 The convolution layer comprises a set of independent filters (6 in the example shown).
 Each filter is independently convolved with the image, and we end up with 6 feature maps of
shape 28×28×1.
 All these filters are initialized randomly and become the parameters that will be learned by
the network subsequently.
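As a worked example (the 32×32×1 input size is an assumption chosen to match LeNet-5; the slide gives only the output shape): a 5×5×1 filter applied at stride 1 with no padding gives (32 − 5)/1 + 1 = 28 along each side, i.e. one 28×28×1 feature map per filter, and stacking the 6 maps yields a 28×28×6 output volume.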
Pooling Layers
Layer Dimensions
Output dimensions and number of parameters of LeNet-5
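As a sketch of the usual parameter count (the function name convParams is illustrative; note the classic LeNet-5 C3 layer uses sparse connections, so its true count differs from this full-connectivity formula):

// Parameters of a convolutional layer: f*f*cIn weights per filter, plus one bias each.
int convParams(int f, int cIn, int cOut) {
    return (f * f * cIn + 1) * cOut;
}
// convParams(5, 1, 6) == 156 for LeNet-5's first convolutional layer (C1).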
Summary:
