
Convolutional Neural Network

By Gagandeep Kaur
From Cats to CNN
 Around 1959, Hubel and Wiesel performed a famous experiment on a cat.
 The cat sat in front of a screen on which lines were displayed at
different locations and in different orientations: slanted, horizontal,
vertical, and so on.
 Electrodes fitted to the cat measured which parts of the brain
actually respond to different visual stimuli.
 The outcome of the study was that different neurons in the brain fire
only for particular types of stimuli; it is not the case that all neurons in
the brain fire for every kind of visual stimulus.
From Cats to CNN
 This is essentially the idea behind convolutional neural
networks, starting from the Neocognitron, which was proposed back in
1980.
 The Neocognitron, proposed by Fukushima (1980), is a
hierarchical multilayered neural network capable of
robust visual pattern recognition through learning.
 An example of the response of a Neocognitron that
has been trained to recognize handwritten digits: the
input pattern is correctly recognized as '5'.
From Cats to CNN
 The modern convolutional neural network was proposed by
Yann LeCun in 1989.
 He was interested in using it for the task of handwritten
digit recognition, applying backpropagation over a Convolutional
Neural Network (CNN), again in the context of
postal delivery services.
 Many PIN codes and phone numbers get written on
postcards, and there was a requirement to read them
automatically, so that the mail could be sorted into different
categories according to the postal code.
From Cats to CNN

 In 1998 came the now famous MNIST data set, which is used for
teaching deep neural network courses and is still popular.
 LeNet-5 CNN Architecture


Revival of CNN (renamed Deep Learning, 2010)

2012-2016
Biological Neurons
MP Neuron (McCulloch and Pitts Neuron)
Convolution In Image Processing

This process is known as Padding

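For example, assuming zero padding: surrounding a 5×5 image with a one-pixel border of zeros gives a 7×7 image, so a 3×3 convolution again produces a 5×5 output, the same size as the original image.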

Convolution In Image Processing
1D Convolution
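As a minimal sketch (not from the slide; the names signal, kernel, output, n, and r are illustrative), a 1D convolution slides a kernel of length 2r+1 along a signal of length n, here with zero padding so the output keeps the input's length:

// 1D convolution with zero padding: output has the same length as the input.
for (int i = 0; i < n; i++) {
    float temp = 0.0f;
    for (int k = -r; k <= r; k++) {
        if (i + k < 0 || i + k >= n)
            continue;                    // zero padding at the borders
        temp += signal[i + k] * kernel[k + r];
    }
    output[i] = temp;
}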
2D Convolution
7×7 input, 3×3 kernel, 5×5 output
2D Convolution

No padding and no strides


2D Convolution
No padding and strides
2D Convolution
Padding and Strides
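The captions above all follow one output-size formula. As a hedged sketch (the function name convOutputSize is illustrative), for an n×n input, k×k kernel, padding p, and stride s:

// Output size of a convolution: floor((n + 2p - k) / s) + 1.
int convOutputSize(int n, int k, int p, int s) {
    return (n + 2 * p - k) / s + 1;   // integer division floors for non-negative values
}

// For example, convOutputSize(7, 3, 0, 1) == 5, matching the 7x7 -> 5x5 case above;
// convOutputSize(5, 3, 1, 2) == 3 shows padding and strides combined.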
3D Convolution

 2D convolution refers *not* to the dimension of the convolution kernel but
to the dimension of the output. The output's dimension is 2D, single
channel. A single 2D convolution pass over a 3D image uses a 3D
convolution kernel to obtain the 2D output. The convolution works like this:

for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        float temp = 0.0f;
        for (int ry = -kernelRadius; ry <= kernelRadius; ry++) {
            for (int rx = -kernelRadius; rx <= kernelRadius; rx++) {
                for (int z = 0; z < numberOfChannels; z++) {
                    int iy = y + ry, ix = x + rx;
                    if (iy < 0 || iy >= height || ix < 0 || ix >= width)
                        continue;   // zero padding at the borders
                    // Kernel indices are shifted by kernelRadius so they start at 0.
                    temp += image[iy][ix][z]
                          * convolutionKernelWeights[ry + kernelRadius][rx + kernelRadius][z];
                }
            }
        }
        output[y][x] = temp;
        // If a bias and activation function are used:
        output[y][x] = activationFunction(output[y][x] + bias);
    }
}
3D Convolution
Relation between Convolution and CNN
Image Classification: Feature Engineering
Sparse pixels: remove unwanted pixels

This calculates the gradient of pixel change
For blurring
In classical machine learning, the machine learns the weights of the classifier, but
the kernels used for feature engineering are hand-crafted (a sketch follows below).
In deep learning we learn the weights of these kernels as well, so the
feature engineering itself is learned.
Can we learn multiple kernels?
Yes, we can learn multiple kernels of variable size.
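As an illustration of a hand-crafted kernel (a minimal sketch; the 3×3 Sobel operator below is a standard edge-detection kernel and not necessarily the one shown on the slide):

// Sobel kernel for horizontal gradients: a classic hand-crafted feature detector.
const float sobelX[3][3] = {
    { -1.0f, 0.0f, 1.0f },
    { -2.0f, 0.0f, 2.0f },
    { -1.0f, 0.0f, 1.0f }
};

// Response at pixel (y, x) of a single-channel image (border handling omitted):
float temp = 0.0f;
for (int ry = -1; ry <= 1; ry++)
    for (int rx = -1; rx <= 1; rx++)
        temp += image[y + ry][x + rx] * sobelX[ry + 1][rx + 1];
// In deep learning, these nine weights would be learned rather than fixed by hand.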
MNIST Digit Data set
CNN
This will lead to underfitting if we use the
same kernel for the whole image
Different kernels to detect different
features in the same image
Example
CNN Terminology
 Convolution
 Filter/Kernel
 Stride
 Padding
 Feature Map
 Volume
 Parameter Sharing
 Local Connectivity
 Pooling (Subsampling/ Down sampling)
 Max pooling/Average Pooling
 Subsampling ratio
 Activation Function
 ReLU
 Softmax
Pooling (Subsampling/ Down sampling)

 Its function is to progressively reduce the spatial size of the representation,
to reduce the number of parameters and the amount of computation in the network.
 The pooling layer operates on each feature map independently.
 Pooling is of two types:
 Average Pooling: calculate the average value for each patch of the feature map.
 Maximum Pooling (or Max Pooling): calculate the maximum value for each
patch of the feature map.


Max Pooling: to reduce the size
Pooling: enhancing the power of convolutions


 First of all, the image size is reduced to half along each side: by taking groups of 2×2
pixels and only retaining the maximum, the image becomes smaller.
 The edges we kept when applying the convolution with the edge filter are not only
maintained, but also intensified.
 This means we have been able to reduce the information the image contains (by
keeping only a quarter of the pixels), while still keeping and intensifying the useful
features the filters reveal when convolving the image.
No Parameters to learn
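As a minimal sketch (the array names featureMap and pooled are illustrative), 2×2 max pooling with stride 2 halves each spatial dimension and, as noted above, has no parameters to learn:

// 2x2 max pooling with stride 2: each output pixel is the maximum of one 2x2 patch.
for (int y = 0; y < height / 2; y++) {
    for (int x = 0; x < width / 2; x++) {
        float m = featureMap[2 * y][2 * x];
        for (int ry = 0; ry < 2; ry++)
            for (int rx = 0; rx < 2; rx++)
                if (featureMap[2 * y + ry][2 * x + rx] > m)
                    m = featureMap[2 * y + ry][2 * x + rx];
        pooled[y][x] = m;   // average pooling would instead sum the patch and divide by 4
    }
}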
CNN
CNNs operate over volumes!

➢ The input is an RGB image.

➢ Unlike regular neural networks, where the input is a
vector, here the input is a multi-channeled image (3
channels in this case).
Convolution Operation with a filter
Output of Convolution Operation with a filter
CNN Layers: Convolutional Layer
 The convolution layer is the main building block of a convolutional neural network.
 The convolution layer comprises a set of independent filters (6 in the example shown).
 Each filter is independently convolved with the image, and we end up with 6 feature maps of
shape 28×28×1.
 All these filters are initialized randomly and become the parameters that will be learned by
the network subsequently.
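As a worked example (the 32×32×1 input size is an assumption chosen to match LeNet-5; the slide gives only the output shape): a 5×5×1 filter applied at stride 1 with no padding gives (32 − 5)/1 + 1 = 28 along each side, i.e. one 28×28×1 feature map per filter, and stacking the 6 maps yields a 28×28×6 output volume.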
Pooling Layers
Layer Dimensions
Output dimensions and number of parameters of LeNet-5
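As a sketch of the usual parameter count (the function name convParams is illustrative; note the classic LeNet-5 C3 layer uses sparse connections, so its true count differs from this full-connectivity formula):

// Parameters of a convolutional layer: f*f*cIn weights per filter, plus one bias each.
int convParams(int f, int cIn, int cOut) {
    return (f * f * cIn + 1) * cOut;
}
// convParams(5, 1, 6) == 156 for LeNet-5's first convolutional layer (C1).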
Summary:
