Lecture 8, 9 - Neural Networks

Neural Networks

• One of the most studied and researched topics

• Have many real-life applications

• E.g. Google Translate, Google Assistant


[Figure: a biological neuron and an artificial neural network]
Human Brain

• The human brain has a very large number of neurons.

• All neurons are interconnected in a network.

• This connectivity lets information and data flow between them, be processed, and generate output.

• The human brain can learn from experience and train itself.

• This capacity for learning and training makes the human brain a very important organ.
Human Brain

• Makes humans intelligent.

• Responsible for building our perceptions based on our experience through the sense organs.

• E.g. the brain of a newborn: in just a few days, the baby starts recognising faces and voices.

• Over a period, they begin associating sounds with objects, e.g. the barking sound with a dog.

• They learn new things through sight and observation.

We want to build machines, artificial neural networks, that work in the same manner as the human brain.
Artificial Neural Network

• Inspired by the biological neurons within the human body, which activate under certain circumstances, resulting in a related action performed by the body in response.

• Consists of various layers of interconnected artificial neurons powered by activation functions that help in switching them ON/OFF.

• Like traditional machine learning algorithms, here too there are certain values that neural nets learn in the training phase.
Artificial Neural Network

• Is a computational learning system that uses a network of functions to understand and translate a data input of one form into a desired output, usually in another form. The concept of the artificial neural network was inspired by human biology and the way neurons of the human brain function together to understand inputs from human senses.

• Neural networks are just one of many tools and approaches used in machine learning algorithms. The neural network itself may be used as a piece in many different machine learning algorithms to process complex data inputs into a space that computers can understand.

• Neural networks are being applied to many real-life problems today, including speech and image recognition, spam email filtering, finance, and medical diagnosis, to name a few.
Components of Neural Network

Weights are numeric values that are multiplied by inputs. In backpropagation, they are modified to reduce the loss. In simple words, weights are machine-learned values from neural networks. They self-adjust depending on the difference between predicted outputs and training targets.

Activation Function is a mathematical formula that helps the neuron to switch ON/OFF.

• Input layer represents the dimensions of the input vector.

• Hidden layer represents the intermediary nodes that divide the input space into regions with (soft) boundaries. It takes in a set of weighted inputs and produces output through an activation function.

• Output layer represents the output of the neural network.
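To make these components concrete, here is a minimal sketch of a single-hidden-layer network in NumPy. It is illustrative only: the layer sizes, the random weight initialization, and the sigmoid activation are assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    # Activation function: squashes the weighted sum into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Input layer: a 3-dimensional input vector
x = np.array([0.5, -0.2, 0.1])

# Hidden layer: 4 nodes, each with one weight per input plus a bias
W1 = rng.normal(size=(4, 3))   # weights (the machine-learned values)
b1 = np.zeros(4)               # biases

# Output layer: 1 node
W2 = rng.normal(size=(1, 4))
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)   # hidden layer: weighted sum -> activation
y = sigmoid(W2 @ h + b2)   # output layer
print(y)
```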
Single hidden layer Neural Network
Two hidden layer Neural Network
Importance of Neural Network

Without Neural Network: Let's have a look at the example given below. Here we have a machine that we have trained with four types of cats, as you can see in the image below. Once we are done with the training, we provide that machine a random image containing a cat. Since this cat is not similar to the cats with which we trained our system, and the background has also changed, without a neural network our machine would not identify the cat in the picture. Basically, the machine gets confused in figuring out where the cat is.

With Neural Network: However, in the case with a neural network, even if we have not trained our machine with that particular cat, it can still identify certain features of the cats we trained on, match those features with the cat in that particular image, and identify the cat. So, with the help of this example, you can clearly see the importance of the concept of a neural network.
Working of Artificial Neural Network

• Each neuron receives a multiplied version of the inputs and random weights, to which a static bias value is added (unique to each neuron layer); this is then passed to an appropriate activation function, which decides the final value to be given out of the neuron.

• There are various activation functions available, chosen as per the nature of the input values. Once the output is generated from the final neural net layer, the loss function (predicted vs target output) is calculated, and backpropagation is performed, in which the weights are adjusted to minimize the loss. Finding optimal values of the weights is what the overall operation focuses on.
Working of Artificial Neural Network

Perceptron
Working of Artificial Neural Networks

Instead of directly getting into the working of artificial neural networks, let's break down and try to understand the neural network's basic unit, which is called a Perceptron.

So, a perceptron can be defined as a neural network with a single layer that classifies linear data. It further constitutes four major components, which are as follows:

1. Inputs

2. Weights and Bias

3. Summation Function

4. Activation or transformation function
The main logic behind the concept of the Perceptron is as follows:

The inputs (x) are fed into the input layer, where they are multiplied by their allotted weights (w) and then added together to form weighted sums. These weighted sums are then passed through the pertinent activation function, as shown in the sketch below.
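A minimal sketch of those four components, assuming a step activation and hand-picked weights (the values below are illustrative, not from the slides):

```python
import numpy as np

def perceptron(x, w, b):
    # 1. Inputs x; 2. weights w and bias b;
    # 3. summation function: weighted sum of the inputs plus the bias
    z = np.dot(x, w) + b
    # 4. activation/transformation function: a step function here
    return 1 if z > 0 else 0

x = np.array([1.0, 0.0])      # inputs
w = np.array([0.6, 0.6])      # allotted weights
b = -0.5                      # bias
print(perceptron(x, w, b))    # -> 1
```

With w = (0.6, 0.6) and b = -0.5, this perceptron happens to compute the logical OR of its two binary inputs.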
Weights and Bias

As and when an input variable is fed into the network, a random value is given as the weight of that particular input, such that each individual weight represents the importance of that input in making correct predictions of the result.

Bias helps in adjusting the curve of the activation function so as to accomplish a precise output.

Summation Function

After the weights are assigned to the inputs, the product of each input and its weight is computed. Then the weighted sum is calculated by the summation function, in which all of the products are added.
Activation Function

The main objective of the activation function is to perform a mapping of a weighted sum onto the output. The transformation function comprises activation functions such as tanh, ReLU, sigmoid, etc.

Activation functions are categorized into two main types:

1. Linear Activation Function

2. Non-Linear Activation Function


Linear Activation Function

In the linear activation function, the output of the function is not restricted to any range: it ranges from -infinity to infinity. For each individual neuron, the inputs get multiplied by the weights of that neuron, which leads to an output signal proportional to the input. If all the layers are linear in nature, then the final activation of the last layer is actually just a linear function of the initial layer's input.
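A quick numerical check of that last claim (the matrices are arbitrary illustrative values): stacking two purely linear layers collapses into a single linear map.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 2))    # first linear layer (no activation)
W2 = rng.normal(size=(1, 3))    # second linear layer

x = rng.normal(size=2)

# Two stacked linear layers...
deep = W2 @ (W1 @ x)
# ...equal one linear layer with the combined weight matrix W2 @ W1
shallow = (W2 @ W1) @ x
print(np.allclose(deep, shallow))   # -> True
```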
Non-Linear Activation Function

These are the most widely used activation functions. They help the model generalize and adapt to any sort of data in order to correctly differentiate among the outputs. They solve the following problems faced by linear activation functions:

• Since non-linear functions have derivative functions, the problems related to backpropagation are successfully solved.

• For the creation of deep neural networks, they permit the stacking up of several layers of neurons.
1. Sigmoid or Logistic Activation Function

It provides a smooth gradient by preventing sudden jumps in the output values. It has an output value range between 0 and 1, which helps in the normalization of each neuron's output. For X values between -2 and 2, the curve is very steep. In simple language, it means that even a small change in X can bring a large change in Y.

Its value range between 0 and 1 makes it highly preferred for binary classification, whose result is either 0 or 1.
2. Tanh or Hyperbolic Tangent Activation Function

The tanh activation function works much better than the sigmoid function; simply put, it is an advanced version of the sigmoid activation function. Since it has a value range between -1 and 1, it is utilized by the hidden layers in the neural network, and because of this it has made the process of learning much easier.
3. ReLU (Rectified Linear Unit) Activation Function

ReLU is one of the most widely used activation functions in the hidden layers of a neural network. Its value ranges from 0 to infinity. It clearly helps in solving the problem of backpropagation. It tends to be less computationally expensive than the sigmoid as well as the tanh activation functions. It allows only a few neurons to get activated at a particular instant, which leads to effectual as well as easier computations.
4. Softmax Function

It is a kind of sigmoid function used for solving classification problems. It is mainly used to handle multiple classes, for which it squeezes the output of each class between 0 and 1 and divides each output by the sum of the outputs. This kind of function is specially used by the classifier in the output layer.
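All four functions can be written down in a few lines; a minimal NumPy sketch (the test vector is illustrative):

```python
import numpy as np

def sigmoid(x):
    # Output in (0, 1); preferred for binary classification
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Output in (-1, 1); often used in hidden layers
    return np.tanh(x)

def relu(x):
    # Output in [0, inf); only neurons with x > 0 activate
    return np.maximum(0.0, x)

def softmax(x):
    # Squeezes each class output into (0, 1) and divides by the sum,
    # so the outputs form a probability distribution over the classes
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z), softmax(z))
```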
Feed Forward / Forward Propagation

"The process of receiving an input to produce some kind of output to make some kind of prediction is known as Feed Forward." The feed-forward neural network is the core of many other important neural networks, such as the convolutional neural network.

Each input to an artificial neuron has a weight associated with it. The inputs are first multiplied by their respective weights, and a bias is added to the result. We can call this the weighted sum. The weighted sum then goes through an activation function.

So, an artificial neuron can be thought of as a simple or multiple linear regression model with an activation function at the end.
Feed Forward Propagation

Calculate the input data multiplied by the network's weights, plus the bias, and then pass the result through the activation function.
Input values

x1 = 0.05, x2 = 0.10

Initial weights

w1 = 0.15, w5 = 0.40
w2 = 0.20, w6 = 0.45
w3 = 0.25, w7 = 0.50
w4 = 0.30, w8 = 0.55

Bias values

b1 = 0.35, b2 = 0.60

Target values

T1 = 0.01, T2 = 0.99
Forward Pass

To find the value of H1, we first multiply the input values by the weights:

H1 = x1×w1 + x2×w2 + b1

H1 = 0.05×0.15 + 0.10×0.20 + 0.35

H1 = 0.3775

To calculate the final result of H1, we apply the sigmoid function:

out_H1 = 1 / (1 + e^(-0.3775)) = 0.593269992
We will calculate the value of H2 in the same way as H1:

H2 = x1×w3 + x2×w4 + b1

H2 = 0.05×0.25 + 0.10×0.30 + 0.35

H2 = 0.3925

To calculate the final result of H2, we apply the sigmoid function:

out_H2 = 1 / (1 + e^(-0.3925)) = 0.596884378
Now, we calculate the values of y1 and y2 in the same way as we calculated H1 and H2.

To find the value of y1, we multiply the outcomes of H1 and H2 by the weights:

y1 = out_H1×w5 + out_H2×w6 + b2

y1 = 0.593269992×0.40 + 0.596884378×0.45 + 0.60

y1 = 1.10590597

To calculate the final result of y1, we apply the sigmoid function:

out_y1 = 1 / (1 + e^(-1.10590597)) = 0.75136507
We will calculate the value of y2 in the same way as y1:

y2 = out_H1×w7 + out_H2×w8 + b2

y2 = 0.593269992×0.50 + 0.596884378×0.55 + 0.60

y2 = 1.2249214

To calculate the final result of y2, we apply the sigmoid function:

out_y2 = 1 / (1 + e^(-1.2249214)) = 0.77292847
Now, we will find the total error, which measures how far the outputs are from the target outputs. Using the squared-error convention E_total = Σ ½(target − out)², the total error is calculated as:

E_total = ½(T1 − out_y1)² + ½(T2 − out_y2)² = ½(0.01 − 0.75136507)² + ½(0.99 − 0.77292847)²

So, the total error is E_total = 0.298371109.
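The whole forward pass and the total error can be reproduced in a few lines of NumPy. This is a sketch of the worked example above; the ½-squared-error convention for E_total is the assumption already noted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Inputs, weights, biases, and targets from the example
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
t1, t2 = 0.01, 0.99

# Hidden layer
h1 = sigmoid(x1 * w1 + x2 * w2 + b1)   # sigmoid(0.3775) = 0.59326999
h2 = sigmoid(x1 * w3 + x2 * w4 + b1)   # sigmoid(0.3925) = 0.59688438

# Output layer
y1 = sigmoid(h1 * w5 + h2 * w6 + b2)   # sigmoid(1.10590597) = 0.75136507
y2 = sigmoid(h1 * w7 + h2 * w8 + b2)   # sigmoid(1.22492140) = 0.77292847

# Total error, assuming the convention E = sum of 0.5*(target - out)^2
e_total = 0.5 * (t1 - y1) ** 2 + 0.5 * (t2 - y2) ** 2
print(e_total)   # -> 0.298371109
```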

Now, we will backpropagate this error to update the


weights using a backward pass.
Backpropagation

Backpropagation is one of the important concepts of a neural network. Our task is to classify our data as well as possible. For this, we have to update the weights and biases, but how can we do that in a deep neural network? In the linear regression model, we use gradient descent to optimize the parameters. Similarly, here we also use the gradient descent algorithm, by means of backpropagation.

For a single training example, the backpropagation algorithm calculates the gradient of the error function. Backpropagation can be written as a function of the neural network. Backpropagation algorithms are a set of methods used to efficiently train artificial neural networks following a gradient descent approach that exploits the chain rule.
Backpropagation is an iterative, recursive and efficient method through which the updated weights are calculated to improve the network until it is able to perform the task for which it is being trained. Backpropagation requires the derivatives of the activation functions to be known at network design time.

Now, how is the error function used in backpropagation, and how does backpropagation work? Let's start with an example and work through it mathematically to understand exactly how backpropagation updates the weights.
Cost / Loss / Error Function

When training a neural network, the cost value J quantifies the network's error, i.e., its output's deviation from the ground truth. We calculate it as the average error over all the objects in the training set, and our goal is to minimize it.

For example, let's say we have a network that classifies animals either as cats or dogs. It has two output neurons y1 and y2, where the former represents the probability that the animal is a cat and the latter that it's a dog. Given an image of a cat, we expect y1 = 1 and y2 = 0.

However, if the network outputs y1 = 0.25 and y2 = 0.65, we can quantify our error on that image as the squared distance:

(1 − 0.25)² + (0 − 0.65)² = 0.985

In general, over n training objects:

J = (1/n) Σᵢ (yᵢ − ŷᵢ)²

We use the cost to update the weights and biases so that the actual
outputs get as close as possible to the desired values. To decide
whether to increase or decrease a coefficient, we calculate its
partial derivative using backpropagation. Let’s explain it with an
example.

We update the weight according to the following rule:

w_new = w_old − η · (∂J/∂w_old)

where η is the learning rate.
EXAMPLE:
Let’s say we have only one neuron in the
input, hidden, and output layers:

To update the weights and biases, we need to


see how J reacts to small changes in those
parameters. We can do that by computing the
partial derivatives of J with respect to them.
But before that, let’s recap how the variables
in our problem are related:
The variables in our one-neuron-per-layer network are chained as follows: the input x produces z₁ = w₁x + b₁, the hidden activation h = f(z₁), then z₂ = w₂h + b₂, the output y = f(z₂), and finally the cost J is computed from y.
So, if we want to see how changing w₂ affects the cost function, we should compute the partial derivative by applying the chain rule of calculus:

∂J/∂w₂ = (∂z₂/∂w₂) · (∂y/∂z₂) · (∂J/∂y)
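A sketch of this chain-rule computation and the subsequent weight update for the one-neuron-per-layer network. The sigmoid activation, the squared-error cost J = (y − t)², and all numeric values are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass through the 1-1-1 network: x -> z1 -> h -> z2 -> y
x, t = 0.5, 1.0                  # input and target (illustrative values)
w1, b1, w2, b2 = 0.4, 0.1, 0.3, 0.2
eta = 0.1                        # learning rate

z1 = w1 * x + b1
h = sigmoid(z1)
z2 = w2 * h + b2
y = sigmoid(z2)
J = (y - t) ** 2                 # assumed squared-error cost

# Chain rule: dJ/dw2 = (dz2/dw2) * (dy/dz2) * (dJ/dy)
dz2_dw2 = h                      # since z2 = w2*h + b2
dy_dz2 = y * (1.0 - y)           # derivative of the sigmoid
dJ_dy = 2.0 * (y - t)
dJ_dw2 = dz2_dw2 * dy_dz2 * dJ_dy

# Gradient-descent update: w_new = w_old - eta * dJ/dw_old
w2 = w2 - eta * dJ_dw2
print(dJ_dw2, w2)
```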
Different optimization algorithms are available for error minimization:

• Gradient descent

• Adagrad

• Momentum

• Adam

• Ftrl

• RMSprop, etc.
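In practice these optimizers are usually picked off the shelf. As a sketch, with the Keras API (assuming TensorFlow is installed; the tiny model here is just a placeholder):

```python
import tensorflow as tf

# Any of the listed optimizers can be swapped in here, e.g.
# tf.keras.optimizers.SGD(0.01), SGD(0.01, momentum=0.9),
# Adagrad(), RMSprop(), Ftrl()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=optimizer, loss="mse")
```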
Types of Neural Network

• Multi-Layer Perceptrons (MLP)

• Convolutional Neural Networks (CNN)

• Recurrent Neural Networks (RNN)


Multilayer Perceptrons (MLPs)

• A multilayer perceptron (MLP) is a class of feedforward artificial neural network (ANN).

• MLP models are the most basic deep neural networks, composed of a series of fully connected layers.

• Today, MLP machine learning methods can be used to overcome the requirement of high computing power required by modern deep learning architectures.

• Each new layer is a set of nonlinear functions of a weighted sum of all outputs (fully connected) from the prior one.
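A minimal sketch of such a stack of fully connected layers in Keras (the layer sizes and activations are illustrative assumptions):

```python
import tensorflow as tf

# An MLP is a series of fully connected (Dense) layers: each layer
# applies a nonlinear function to a weighted sum of all prior outputs.
mlp = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),              # e.g. a flattened 28x28 image
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # class probabilities
])
mlp.summary()
```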
Convolutional Neural Network (CNN)

• A convolutional neural network (CNN, or ConvNet) is another class of deep neural networks.

• CNNs are most commonly employed in computer vision. Given a series of images or videos from the real world, with the utilization of CNN, the AI system learns to automatically extract the features of these inputs to complete a specific task, e.g., image classification, face authentication, and image semantic segmentation.

• Different from fully connected layers in MLPs, in CNN models, one or multiple convolution layers extract the simple features from input by executing convolution operations. Each layer is a set of nonlinear functions of weighted sums at different coordinates of spatially nearby subsets of outputs from the prior layer, which allows the weights to be reused.
It is a specialized type of neural network that can learn spatial hierarchies of features directly from pixel values by using filters that scan over the input image and extract relevant features.

The key components of a CNN include convolutional layers, pooling layers, and fully connected layers. The convolutional layer is the core building block of the CNN and is responsible for learning the features of the input image. It applies a set of filters to the input image, each of which extracts a specific feature, such as edges, textures, or shapes.

The pooling layer is used to downsample the output of the convolutional layer and reduce the dimensionality of the features. This helps to reduce overfitting and improve the efficiency of the network. The fully connected layer is the last layer of the network and is used to classify the input image based on the learned features.

During training, the CNN uses backpropagation to adjust the weights and biases of the network to optimize its performance. This process involves propagating the error back through the network and updating the parameters of the network to minimize the error.
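Putting the three layer types together, a minimal illustrative Keras CNN; the filter counts, kernel sizes, and input shape are assumptions, not from the slides:

```python
import tensorflow as tf

cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    # Convolutional layer: filters scan the image and extract features
    tf.keras.layers.Conv2D(16, kernel_size=3, activation="relu"),
    # Pooling layer: downsamples to reduce dimensionality and overfitting
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    # Fully connected layer: classifies based on the learned features
    tf.keras.layers.Dense(10, activation="softmax"),
])
# Training uses backpropagation to adjust the weights and biases
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```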
CNNs have been used to achieve state-of-the-art performance in a variety of image recognition and classification tasks, including object detection, face recognition, and image segmentation. They are also used in natural language processing for tasks such as text classification and sentiment analysis.

Overall, CNNs are a powerful tool for image analysis and recognition, and they have shown great potential in many applications. Their ability to learn spatial hierarchies of features directly from pixel values makes them well-suited for a wide range of image-related tasks.
• AlexNet. For image classification, as the first CNN neural network to win the ImageNet Challenge in 2012, AlexNet consists of five convolution layers and three fully connected layers. Thus, AlexNet requires 61 million weights and 724 million MACs (multiply-add computations) to classify an image with a size of 227×227.

• VGG-16. To achieve higher accuracy, VGG-16 is trained to a deeper structure of 16 layers consisting of 13 convolution layers and three fully connected layers, requiring 138 million weights and 15.5G MACs to classify an image with a size of 224×224.

• GoogleNet. To improve accuracy while reducing the computation of DNN inference, GoogleNet introduces an inception module composed of different sized filters. As a result, GoogleNet achieves a better accuracy performance than VGG-16 while only requiring seven million weights and 1.43G MACs to process an image of the same size.

• ResNet. ResNet, the state-of-the-art effort, uses the "shortcut" structure to reach a human-level accuracy with a top-5 error rate below 5%. In addition, the "shortcut" module is used to solve the gradient vanishing problem during the training process, making it possible to train a DNN model with a deeper structure. The performance of popular CNNs applied to AI vision tasks has gradually increased over the years, surpassing human vision (a 5% error rate).
Recurrent Neural Network (RNN)

• A recurrent neural network (RNN) is another class of artificial neural networks that uses sequential data feeding. RNNs have been developed to address the time-series problem of sequential input data.

• The input of an RNN consists of the current input and the previous samples. Therefore, the connections between nodes form a directed graph along a temporal sequence. Furthermore, each neuron in an RNN owns an internal memory that keeps the information of the computation from the previous samples.
• RNN models are widely used in Natural Language Processing (NLP) due to their superiority in processing data with an input length that is not fixed. The task of the AI here is to build a system that can comprehend natural language spoken by humans, e.g., natural language modeling, word embedding, and machine translation.

• In RNNs, each subsequent layer is a collection of nonlinear functions of weighted sums of outputs and the previous state. Thus, the basic unit of an RNN is called a "cell", and each cell consists of layers and a series of cells that enables the sequential processing of recurrent neural network models.
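A minimal sketch of one such cell in NumPy: each step combines the current input with the previous hidden state, which acts as the cell's internal memory. The sizes, tanh nonlinearity, and random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# One RNN cell: a nonlinear function of a weighted sum of the
# current input and the previous state (the cell's internal memory)
W_x = rng.normal(size=(8, 4))   # input -> hidden weights
W_h = rng.normal(size=(8, 8))   # previous state -> hidden weights
b = np.zeros(8)

def rnn_cell(x_t, h_prev):
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Sequential data feeding: process a sequence of 5 input vectors
h = np.zeros(8)
for x_t in rng.normal(size=(5, 4)):
    h = rnn_cell(x_t, h)        # the state carries information forward
print(h)
```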
