Unit 5
Convolutional Neural Networks (CNNs, or ConvNets) are a type of multi-layer neural network
designed to discern visual patterns from pixel images. The name comes from 'convolution',
a mathematical operation: a linear operation in which two functions are multiplied to
produce a third function that expresses how the shape of one function is modified by
the other.
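To make "multiplying two functions to create a third" concrete, here is a minimal sketch of a one-dimensional discrete convolution using NumPy; the signal and kernel values are arbitrary illustrative choices, not anything prescribed above.

```python
import numpy as np

# A discrete 1-D signal and a small smoothing kernel (illustrative values).
signal = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
kernel = np.array([0.25, 0.5, 0.25])

# np.convolve slides the (flipped) kernel over the signal and sums the
# element-wise products at each position, producing a third function.
result = np.convolve(signal, kernel, mode="valid")
print(result)  # [1.  2.  2.5 2.  1. ] -- a smoothed version of the input
```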
The ConvNet's job is to compress the images into a format that is easier to process while
preserving the features that matter for obtaining a good prediction. This is critical for
designing an architecture that can learn features while remaining scalable to large
datasets. In short, a ConvNet has three building-block layers; let's have a look:
1. Convolution Layer:
This is the first layer, used to extract the various features from the input images. In
this layer, the mathematical operation of convolution is performed between the input
image and a filter of a particular size M×M. The filter slides over the input image,
and at each position the dot product is taken between the filter and the M×M patch
of the image it covers.
The output is termed the feature map, and it captures information about the image
such as corners and edges. This feature map is then fed to subsequent layers, which
learn further features of the input image.
The convolution layer passes its result to the next layer after applying the
convolution operation to the input. Convolutional layers benefit a CNN greatly
because they keep the spatial relationship between the pixels intact.
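As a sketch of the sliding-filter dot product described above, here is a plain NumPy implementation; the 5×5 input and the 3×3 vertical-edge filter are illustrative assumptions. (As is standard in CNN layers, the kernel is not flipped, so strictly speaking this computes a cross-correlation.)

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2-D convolution: slide an MxM kernel over the image and
    take the dot product with each MxM patch it covers."""
    m, n = kernel.shape
    out_h = image.shape[0] - m + 1
    out_w = image.shape[1] - n + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + m, j:j + n]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Illustrative 5x5 "image" with a vertical edge, and a 3x3 edge filter.
image = np.array([[0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 1]], dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

print(convolve2d(image, kernel))  # strong responses where the edge lies
```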
2. Pooling Layer (POOL):
This layer is in charge of reducing dimensionality, which cuts the amount of
computing power required to process the data. Pooling can be divided into two
types: max pooling and average pooling. Max pooling returns the maximum value
from the region of the image covered by the kernel; average pooling returns the
average of all the values in that region.
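A minimal sketch of both pooling types on a small feature map, assuming a 2×2 window with a stride of 2 (a common but not mandated choice):

```python
import numpy as np

def pool2d(feature_map, size=2, reduce_fn=np.max):
    """Apply pooling with a size x size window and a stride equal to the
    window size (non-overlapping regions)."""
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            out[i // size, j // size] = reduce_fn(feature_map[i:i + size, j:j + size])
    return out

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [3, 4, 1, 8]], dtype=float)

print(pool2d(fmap, reduce_fn=np.max))   # [[6. 4.]  [7. 9.]]
print(pool2d(fmap, reduce_fn=np.mean))  # [[3.75 2.25]  [4.   4.5 ]]
```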
3. Dropout/Output Layer:
Usually, when all the features are connected to the fully connected (FC) layer, this
can cause overfitting on the training dataset. Overfitting occurs when a model fits
the training data so closely that its performance suffers when the model is used on
new data.
To overcome this problem, a dropout layer is utilised, wherein a fraction of the
neurons is dropped from the neural network during the training process, resulting in
a reduced model size. With a dropout rate of 0.3, 30% of the nodes are dropped out
randomly from the neural network.
Dropout improves the performance of a machine learning model because it prevents
overfitting by making the network simpler; the neurons are dropped only during
training.
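A minimal sketch of the dropout mechanism, using the text's rate of 0.3; the rescaling by 1/(1 - rate) is the common "inverted dropout" convention and is an assumption on my part, since the text only says that 30% of the nodes are dropped:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.3, training=True):
    """During training, zero out `rate` of the activations at random and
    rescale the rest so the expected sum is unchanged (inverted dropout).
    At inference time the layer is a no-op."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= rate  # keep ~70% of the nodes
    return activations * mask / (1.0 - rate)

x = np.ones(10)
print(dropout(x, rate=0.3))  # roughly 3 of the 10 values zeroed at random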
Recurrent Neural Networks (RNN)
RNNs have a "memory" that retains information about everything calculated so far.
An RNN uses the same parameters for each input, because it performs the same task
on all the inputs or hidden states to produce the output. This reduces the number of
parameters, unlike other neural networks.
The working of an RNN can be understood with the help of the following example:
Example: Suppose there is a deeper network with one input layer, three hidden
layers, and one output layer. Then, like other neural networks, each hidden layer
will have its own set of weights and biases: say (w1, b1) for hidden layer 1,
(w2, b2) for the second hidden layer, and (w3, b3) for the third hidden layer.
This means that each of these layers is independent of the others, i.e. they do not
memorize the previous outputs.
An RNN, in contrast, gives the same weights and biases to every step and feeds each
state into the next, so the current state depends on all previous ones. The formula
for calculating the current state is:
h_t = f(h_(t-1), x_t)
where:
h_t -> current state
h_(t-1) -> previous state
x_t -> input state
Applying the activation function (tanh), this becomes:
h_t = tanh(W_hh * h_(t-1) + W_xh * x_t)
where:
W_hh -> weight at recurrent neuron
W_xh -> weight at input neuron
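To tie the formulas together, here is a minimal NumPy sketch of an RNN forward pass; the layer sizes, the random weights, and the output projection W_hy are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

hidden_size, input_size, output_size = 4, 3, 2  # illustrative sizes

# The SAME weights are reused at every time step (parameter sharing).
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # recurrent weights
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1   # input weights
W_hy = rng.standard_normal((output_size, hidden_size)) * 0.1  # output weights (assumed)

def rnn_step(h_prev, x_t):
    """h_t = tanh(W_hh . h_(t-1) + W_xh . x_t) -- the state update above."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_t)

# Run a short input sequence through the network, carrying the state along.
h = np.zeros(hidden_size)                    # initial state
sequence = rng.standard_normal((5, input_size))
for x_t in sequence:
    h = rnn_step(h, x_t)                     # "memory" of everything seen so far
y = W_hy @ h                                 # output computed from the final state
print(h, y)
```

Because the same W_hh and W_xh are applied at every step, the number of parameters does not grow with the length of the sequence, which is the parameter-sharing property described above.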