

Deep Learning Application for Communication Engineering
Subject Code: ECE7419

By

Dr. RAM SEWAK SINGH


Associate Professor

Electronics and Communication Engineering Department


School of Electrical Engineering and Computing
Adama Science and Technology University,
Ethiopia, P.O. Box: 1888
Chapter 4: Convolutional Neural Network (CNN)

This chapter introduces the convolutional neural network (ConvNet or CNN), which is a deep neural network specialized for image recognition. This technique exemplifies how significantly deep layers improve the processing of information such as images.

A ConvNet is not just a deep neural network (a special type of feed-forward neural network) with many hidden layers. It is a deep network that imitates how the visual cortex of the brain processes and recognizes images.

Basically, image recognition is classification. For example, recognizing whether the image in a picture is a cat or a dog is the same as classifying the image into the cat class or the dog class.

The same applies to letter recognition: recognizing the letter in an image is the same as classifying the image into one of the letter classes. Therefore, the output layer of a ConvNet generally employs a multiclass classification neural network.

Weight Initialization

The way that the weights of a neural network are initialized is very important; it can determine whether the algorithm converges at all, with some initial points being so unstable that the algorithm encounters numerical difficulties and fails altogether.

Most of the time, the weights are initialized randomly from a Gaussian or uniform distribution whose scale balances all the layers so that they have the same activation variance and the same gradient variance. More formally, with Xavier (Glorot) initialization, the weights of a layer with n_in inputs and n_out outputs are drawn as W ~ U(-sqrt(6/(n_in + n_out)), +sqrt(6/(n_in + n_out))).
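As a minimal sketch (assuming the slide intended the Xavier/Glorot scheme, which matches the balanced-variance description above), the snippet below shows how such an initialization could be implemented for one fully connected layer using NumPy; the layer sizes are illustrative.

```python
import numpy as np

def xavier_uniform(n_in, n_out, seed=0):
    """Draw weights from U(-limit, +limit) with limit = sqrt(6 / (n_in + n_out)),
    which keeps activation and gradient variances roughly balanced across layers."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

# Illustrative layer sizes (assumed, not from the slides)
W = xavier_uniform(n_in=784, n_out=128)
print(W.shape, float(W.min()), float(W.max()))
```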
Concepts of Convolutional Neural Network (CNN)

 A deep CNN model consists of a finite set of processing layers that can learn various features of the input data (e.g., an image) at multiple levels of abstraction.

 The initial layers learn and extract low-level features (with lower abstraction), and the deeper layers learn and extract high-level features (with higher abstraction). The basic conceptual model of a CNN is shown in the figure below; the different types of layers are described in the subsequent sections.

Figure: Conceptual model of CNN


Network Layers
A CNN is composed of multiple building blocks (known as the layers of the architecture), such as the convolutional layer, the pooling layer, and the fully connected layer.

 Convolutional Layer

The convolutional layer is the most important component of any CNN architecture. It contains a set of convolutional kernels (also called filters), which get convolved with the input image (an N-dimensional matrix) to generate an output feature map.

What is a kernel? A kernel can be described as a grid of discrete values or numbers, where each value is known as a weight of the kernel. At the start of the training process of a CNN model, all the weights of a kernel are assigned random numbers (other approaches for initializing the weights are also available).
(Kernel = Filter = Feature Detector)
What is the Convolution Operation?
Unlike other classical neural networks (where the input is in vector format), in a CNN the input is a multi-channel image (e.g., an RGB image as in Figure 4 has 3 channels, while a gray-scale image has a single channel).
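As an illustrative sketch (not part of the slides), the NumPy code below performs a single-channel 2-D convolution (cross-correlation, as used in CNNs) of a small image with a 3x3 kernel; the image and kernel values are made-up examples.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image, stride 1."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1          # output size without padding
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Made-up 5x5 image and a 3x3 vertical-edge kernel (illustrative only)
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)
print(conv2d(image, kernel))   # 3x3 feature map
```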
Padding (Border Problem Solver)
Padding adds extra border pixels to the input image and is used together with the stride (i.e., the step size taken along the horizontal or vertical direction). We can use a stride value other than 1 in the convolution operation; increasing the stride of the convolution results in a lower-dimensional feature map.

Padding is important because it gives the border pixels of the input image more importance; without any padding, the features at the borders get washed away too quickly.

Fig. 9 gives an example showing the convolution operation with zero-padding and a stride value of 3.

Fig. 9: Convolution with zero-padding and stride 3
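To make the effect of padding and stride concrete, here is a small sketch (not from the slides) of the standard output-size formula for a convolution, output = floor((N - F + 2P) / S) + 1, where N is the input size, F the filter size, P the padding, and S the stride; the example values are assumptions for illustration.

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a convolution: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1

# Illustrative values (assumed): 5x5 and 7x7 inputs, 3x3 filter
print(conv_output_size(5, 3, p=0, s=1))  # 3 -> no padding, stride 1
print(conv_output_size(5, 3, p=1, s=1))  # 5 -> zero-padding preserves the size
print(conv_output_size(7, 3, p=1, s=3))  # 3 -> a larger stride shrinks the feature map
```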
Pooling Layer
The pooling layers are used to sub-sample the feature maps (produced after the convolution operations), i.e., they take the larger feature maps and shrink them to lower-sized feature maps.

While shrinking the feature maps, pooling always preserves the most dominant features (or information) in each pooling step. The pooling operation is performed by specifying the size of the pooled region and the stride of the operation, similar to the convolution operation.
Different types of pooling techniques are used in different pooling layers, such as:
 Max pooling, min pooling, average pooling, gated pooling, tree pooling, etc. Max pooling is the most popular and most widely used pooling technique.
The main drawback of the pooling layer is that it sometimes decreases the overall performance of the CNN. The reason is that the pooling layer helps the CNN find whether a specific feature is present in the given input image or not, without caring about the exact position of that feature.
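As a minimal sketch (not from the slides), the NumPy code below applies 2x2 max pooling with stride 2 to a feature map; the input values are made-up examples.

```python
import numpy as np

def max_pool2d(fmap, size=2, stride=2):
    """2x2 max pooling: keep only the largest value in each pooled region."""
    h, w = fmap.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            region = fmap[r * stride:r * stride + size, c * stride:c * stride + size]
            out[r, c] = region.max()
    return out

# Made-up 4x4 feature map (illustrative only)
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [3, 4, 8, 6]], dtype=float)
print(max_pool2d(fmap))   # [[6. 4.] [7. 9.]]
```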
The Whole CNN
The complete CNN pipeline (shown in the slide figure) is: input image → Convolution → Max Pooling → Convolution → Max Pooling (this pair can be repeated, producing a new, smaller image each time) → Flatten → Fully Connected Feedforward network → output classes (cat, dog, ...).
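As a hedged sketch of how such a pipeline could be written in code (not the author's implementation), the snippet below uses tf.keras to stack two convolution + max-pooling blocks, a flatten layer, and a fully connected classifier; the filter counts, kernel sizes, input shape, and number of classes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative hyperparameters (assumed, not from the slides)
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),                # grey-scale input image
    layers.Conv2D(20, (3, 3), activation="relu"),   # convolution
    layers.MaxPooling2D((2, 2)),                    # max pooling
    layers.Conv2D(40, (3, 3), activation="relu"),   # convolution (repeated block)
    layers.MaxPooling2D((2, 2)),                    # max pooling
    layers.Flatten(),                               # flatten to a column vector
    layers.Dense(100, activation="relu"),           # fully connected feedforward layer
    layers.Dense(10, activation="softmax"),         # multiclass classification output
])
model.summary()
```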
Examples
Figure: A 5x5 image divided into pixels and the corresponding grey-level values (5x5).
Examples (continued)
Figure: Convolution of the example image, producing feature maps for Filter 1, Filter 2, and Filter 3.

A problem with a stride of 1 is that moving the filter one pixel at a time repeats the same pixels across overlapping windows, so each step adds little new information.
Examples (continued)
Figure: Convolution of the example image without padding.
Examples (continued)
Figure: Feature-map images after max pooling.
The pooled feature-map images are converted to a column vector (flattened) and then fed into the classification network.
Add up the maximum values in the training output, and add up the values of the new input at the positions corresponding to those maxima of the training output. Then find the percentage (the ratio of the two sums); if the percentage is > 90%, it means the new input is close to the training input.
Example: Lastly, let's investigate how the image is processed as it passes through the convolution layer and the pooling layer. The original dimension of an MNIST image is 28x28. Once the image is processed with a 9x9 convolution filter, it becomes a 20x20 feature map. As we have 20 convolution filters, the layer produces 20 feature maps. Through the 2x2 mean pooling process, the pooling layer shrinks each feature map to a 10x10 map.
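These dimensions can be checked with the output-size formula used earlier (a quick illustrative sketch, not from the slides):

```python
# Shape check for the MNIST example (28x28 input, 9x9 filter, 2x2 mean pooling)
n, f = 28, 9
conv_out = n - f + 1          # 28 - 9 + 1 = 20 -> 20x20 feature map
pool_out = conv_out // 2      # 20 / 2      = 10 -> 10x10 pooled map
print(conv_out, pool_out)     # 20 10
```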
There are various CNN architectures available, which have been key in building the algorithms that power, and will continue to power, AI in the foreseeable future. Some of them are listed below:
LeNet
AlexNet
VGGNet
GoogLeNet
ResNet
ZFNet

Thank you for your attention!!

