Unit 5


What is CNN?

A Convolutional Neural Network (CNN, or ConvNet) is a type of multi-layer neural network
designed to recognize visual patterns directly from pixel images. The 'convolution' in the
name refers to a mathematical operation: a type of linear operation in which two functions
are combined to produce a third function that expresses how the shape of one is modified
by the other.

The ConvNet’s job is to compress the images into a format that is easier to process while
preserving the features that matter for obtaining a good prediction. This is critical for
designing an architecture that is capable of learning features while also being scalable to
large datasets. In short, a ConvNet is built from a handful of layer types, which are its
building blocks; let’s have a look:

1. Convolution Layer:-
 This is the first layer, used to extract the various features from the input image. In
this layer, the mathematical operation of convolution is performed between the
input image and a filter of a particular size MxM. The filter is slid over the input
image, and at each position the dot product is taken between the filter and the MxM
patch of the image it currently covers.
 The output is termed the feature map, and it gives us information about the image
such as corners and edges. This feature map is later fed to the other layers, which
learn further features of the input image.
 The convolution layer passes its result to the next layer once the convolution
operation has been applied to the input. Convolutional layers benefit a CNN greatly,
as they keep the spatial relationships between the pixels intact. A minimal sketch of
the operation follows this list.
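To make the sliding dot product concrete, here is a minimal NumPy sketch of the
convolution operation (stride 1, no padding); the 5x5 image and the 3x3 edge-detection
filter are hypothetical toy values:

    import numpy as np

    def conv2d(image, kernel):
        """Slide an MxM kernel over a 2-D image, taking the dot product
        at each position (stride 1, no padding)."""
        m, n = kernel.shape
        H, W = image.shape
        out = np.zeros((H - m + 1, W - n + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                patch = image[i:i + m, j:j + n]     # MxM part of the image
                out[i, j] = np.sum(patch * kernel)  # dot product with the filter
        return out

    # Toy 5x5 "image" and a 3x3 edge-detection filter (hypothetical values)
    image = np.arange(25, dtype=float).reshape(5, 5)
    edge_filter = np.array([[-1., -1., -1.],
                            [-1.,  8., -1.],
                            [-1., -1., -1.]])
    feature_map = conv2d(image, edge_filter)
    print(feature_map.shape)  # (3, 3): the resulting feature map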
2. Pooling Layer (POOL):-
 This layer is in charge of reducing dimensionality. It helps reduce the amount of
computing power required to process the data.
 Pooling can be divided into two types: max pooling and average pooling. Max
pooling returns the maximum value from the area of the image covered by the
kernel; average pooling returns the average of all the values in that area. Both are
sketched below.
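A minimal sketch of both pooling types, assuming non-overlapping 2x2 windows over a
hypothetical 4x4 feature map:

    import numpy as np

    def pool2d(feature_map, size=2, mode="max"):
        """Reduce dimensionality by taking the max (or average) of each
        non-overlapping size x size window."""
        H, W = feature_map.shape
        out = np.zeros((H // size, W // size))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                window = feature_map[i*size:(i+1)*size, j*size:(j+1)*size]
                out[i, j] = window.max() if mode == "max" else window.mean()
        return out

    fmap = np.array([[1., 3., 2., 4.],
                     [5., 6., 1., 2.],
                     [7., 2., 9., 1.],
                     [3., 4., 1., 8.]])
    print(pool2d(fmap, mode="max"))      # [[6. 4.] [7. 9.]]
    print(pool2d(fmap, mode="average"))  # [[3.75 2.25] [4.   4.75]]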

3. Fully Connected Layer:-

 The Fully Connected (FC) layer consists of weights and biases along with neurons,
and is used to connect the neurons of two different layers. These layers are usually
placed before the output layer and form the last few layers of a CNN architecture.
 Here, the output of the previous layers is flattened into a vector and fed to the FC
layer. The flattened vector then passes through a few more FC layers, where the
usual mathematical operations take place; it is at this stage that the classification
process begins. Two fully connected layers are stacked because they perform better
than a single one. These layers reduce the need for human supervision, and a sketch
follows.
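A minimal sketch of the flatten-then-connect step; the layer sizes (8 pooled feature maps
of 4x4, 10 output classes) are hypothetical, and a softmax turns the final scores into
class probabilities:

    import numpy as np

    rng = np.random.default_rng(0)

    # Flatten a stack of pooled feature maps into a single vector,
    # then apply a fully connected layer: logits = W x + b
    pooled = rng.normal(size=(8, 4, 4))       # 8 feature maps of 4x4
    x = pooled.flatten()                      # 128-dimensional vector

    W = rng.normal(size=(10, x.size)) * 0.01  # weights for 10 output classes
    b = np.zeros(10)                          # biases

    logits = W @ x + b
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax
    print(probs.shape)  # (10,): one probability per class, summing to 1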
4. Activation Functions:-
 Finally, one of the most important parameters of the CNN model is the activation
function. Activation functions are used to learn and approximate any kind of
continuous and complex relationship between the variables of the network. In
simple words, the activation function decides which information should fire forward
through the network and which should not; it is what adds non-linearity to the
network. A few common choices are sketched below.
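A minimal sketch of three common activation functions (ReLU, sigmoid, tanh) applied to
hypothetical input values:

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)        # fires only for positive inputs

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))  # squashes values into (0, 1)

    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(z))     # [0.  0.  0.  0.5 2. ]
    print(sigmoid(z))  # values between 0 and 1
    print(np.tanh(z))  # values between -1 and 1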

5. Dropout:-
 Usually, when all features are connected to the FC layer, the model can overfit the
training dataset. Overfitting occurs when a model works so well on the training data
that its performance suffers when it is used on new data.
 To overcome this problem, a dropout layer is used: a fraction of the neurons are
dropped from the network during training, temporarily reducing the size of the
model. With a dropout rate of 0.3, 30% of the nodes are dropped out of the network
at random.
 Dropout improves the performance of a machine learning model because it
prevents overfitting by making the network simpler during training. A sketch
follows.
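A minimal sketch of dropout with a rate of 0.3, using the common 'inverted dropout'
formulation in which the surviving activations are rescaled so their expected value is
unchanged:

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout(x, rate=0.3, training=True):
        """Randomly zero out `rate` of the activations during training."""
        if not training:
            return x                        # dropout is disabled at test time
        mask = rng.random(x.shape) >= rate  # ~30% of nodes dropped for rate=0.3
        return x * mask / (1.0 - rate)      # rescale the survivors

    activations = np.ones(10)
    print(dropout(activations, rate=0.3))  # roughly 3 of the 10 entries are zero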

Recurrent Neural Network (RNN):

A Recurrent Neural Network is a type of neural network where the output from the
previous step is fed as input to the current step. In traditional neural networks, all
inputs and outputs are independent of each other; but in cases such as predicting the
next word of a sentence, the previous words are required, and hence there is a need to
remember them. Thus the RNN came into existence, which solved this issue with the
help of a hidden layer. The main and most important feature of an RNN is its hidden
state, which remembers information about a sequence.

An RNN has a “memory” which remembers all information about what has been
calculated. It uses the same parameters for each input, as it performs the same task on
all the inputs and hidden layers to produce the output. This reduces the number of
parameters, unlike other neural networks.

How RNN works

The working of an RNN can be understood with the help of the following example:

Example: Suppose there is a deep network with one input layer, three hidden layers,
and one output layer. Then, like other neural networks, each hidden layer will have its
own set of weights and biases: say (w1, b1) for hidden layer 1, (w2, b2) for the second
hidden layer, and (w3, b3) for the third hidden layer. This means that each of these
layers is independent of the others, i.e. they do not memorize the previous outputs.

Now the RNN will do the following:


 The RNN converts the independent activations into dependent activations by
providing the same weights and biases to all the layers, thus reducing the number of
parameters, and it memorizes each previous output by giving each output as input to
the next hidden layer.
 Hence these three layers can be joined together into a single recurrent layer, such
that the weights and biases of all the hidden layers are the same.
The formula for calculating the current state:

h_t = f(h_{t-1}, x_t)

where:
h_t -> current state
h_{t-1} -> previous state
x_t -> input state

Formula for applying the activation function (tanh):

h_t = tanh(w_hh * h_{t-1} + w_xh * x_t)

where:
w_hh -> weight at recurrent neuron
w_xh -> weight at input neuron

The formula for calculating the output:

y_t = w_hy * h_t

where:
w_hy -> weight at output neuron
y_t -> output
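A minimal NumPy sketch of one recurrent step implementing the formulas above; the
layer sizes and the 0.1 weight scaling are hypothetical:

    import numpy as np

    rng = np.random.default_rng(0)

    hidden, inputs, outputs = 4, 3, 2
    w_xh = rng.normal(size=(hidden, inputs)) * 0.1   # weight at input neuron
    w_hh = rng.normal(size=(hidden, hidden)) * 0.1   # weight at recurrent neuron
    w_hy = rng.normal(size=(outputs, hidden)) * 0.1  # weight at output neuron

    def rnn_step(x_t, h_prev):
        """One time step: the same weights are reused at every step."""
        h_t = np.tanh(w_hh @ h_prev + w_xh @ x_t)  # current state
        y_t = w_hy @ h_t                           # output
        return h_t, y_t

    # Run a toy sequence of 5 input vectors through the recurrent layer
    h = np.zeros(hidden)
    for x in rng.normal(size=(5, inputs)):
        h, y = rnn_step(x, h)  # h_t becomes h_{t-1} for the next step
    print(h, y)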

Training through RNN

1. A single time step of the input is provided to the network.
2. The current state is then calculated from the current input and the previous
state.
3. The current h_t becomes h_{t-1} for the next time step.
4. One can go through as many time steps as the problem requires, joining the
information from all the previous states.
5. Once all the time steps are completed, the final current state is used to
calculate the output.
6. The output is then compared with the actual (target) output, and the error is
generated.
7. The error is then back-propagated through the network to update the weights,
and thus the network (RNN) is trained. A sketch of this loop follows.
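A minimal sketch of steps 1-7 for a scalar RNN trained by backpropagation through
time; the task (predicting the mean of a short sequence), the initial weights, and the
learning rate are all hypothetical:

    import numpy as np

    rng = np.random.default_rng(0)

    w_xh, w_hh, w_hy = 0.5, 0.1, 0.5  # weights at input, recurrent, output neuron
    lr = 0.05                         # learning rate

    for step in range(500):
        xs = rng.uniform(-1, 1, size=6)
        target = xs.mean()

        # Steps 1-5: feed one time step at a time; h_t becomes h_{t-1}
        hs = [0.0]
        for x in xs:
            hs.append(np.tanh(w_hh * hs[-1] + w_xh * x))
        y = w_hy * hs[-1]  # final state is used to calculate the output

        # Step 6: compare with the target output to get the error
        error = y - target

        # Step 7: back-propagate the error through every time step
        dw_xh = dw_hh = 0.0
        dw_hy = 2 * error * hs[-1]
        dh = 2 * error * w_hy
        for t in reversed(range(len(xs))):
            da = dh * (1 - hs[t + 1] ** 2)  # through the tanh
            dw_hh += da * hs[t]
            dw_xh += da * xs[t]
            dh = da * w_hh                  # gradient flows to the previous state

        w_xh -= lr * dw_xh                  # update the shared weights
        w_hh -= lr * dw_hh
        w_hy -= lr * dw_hy

    print(error ** 2)  # squared error shrinks as training proceeds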

Advantages of Recurrent Neural Network


1. An RNN remembers information through time, which is what makes it useful for
time series prediction: it can take previous inputs into account. Variants such as
Long Short-Term Memory (LSTM) networks were designed specifically to retain
information over long spans.
2. Recurrent neural networks can even be combined with convolutional layers to
extend the effective pixel neighborhood.
Disadvantages of Recurrent Neural Network
1. Vanishing and exploding gradient problems.
2. Training an RNN is a very difficult task.
3. It cannot process very long sequences when tanh or ReLU is used as the
activation function.
Applications of Recurrent Neural Network
1. Language Modelling and Generating Text
2. Speech Recognition
3. Machine Translation
4. Image Recognition, Face detection
5. Time series Forecasting
