Lecture 8,9 - Neural Networks
Neural Networks
• One of the most studied and researched topics in machine learning
Neuron
Human Brain
• The human brain can learn from experience and train itself on data.
Weights are numeric values that are multiplied by inputs. In backpropagation, they are modified to reduce the loss. In simple words, weights are machine-learned values from neural networks. They self-adjust depending on the difference between predicted outputs and the training targets.
Activation Function is a mathematical formula that helps the neuron to switch ON/OFF.
• Input layer represents the dimensions of the input vector.
• Output layer represents the output of the neural network.
Single hidden layer Neural Network
Two hidden layer Neural Network
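As a concrete illustration of these architectures, below is a minimal NumPy sketch of a forward pass through a single-hidden-layer network. The layer sizes and the sigmoid activation are illustrative assumptions, not values fixed by the lecture.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation: squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum of inputs plus bias, then activation
    h = sigmoid(W1 @ x + b1)
    # Output layer: weighted sum of hidden activations plus bias
    return sigmoid(W2 @ h + b2)

# Example: 3-dimensional input, 4 hidden neurons, 2 outputs (assumed sizes)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(forward(np.array([0.5, -0.2, 0.1]), W1, b1, W2, b2))
```

A two-hidden-layer network simply inserts one more weighted-sum-plus-activation step between the hidden and output layers.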
Importance of Neural Network
Without Neural Network: Let's have a look at the example given below. Here we have a machine that we have trained with four types of cats, as you can see in the image below. Once we are done with the training, we provide that machine with a random image that contains a cat. Since this cat is not similar to the cats on which we trained our system, and the background has also changed, without a neural network our machine would not identify the cat in the picture. Basically, the machine will get confused in figuring out where the cat is.
With Neural Network: However, with a neural network, even if we have not trained our machine on that particular cat, it can still identify certain features of the cats it was trained on, match those features with the cat in the image, and identify the cat. With the help of this example, you can clearly see the importance of neural networks.
Working of Artificial Neural Network
Perceptron
1. Inputs
2. Weights and Bias
3. Summation Function
4. Activation or transformation function
The main logic behind the concept of the Perceptron is as follows:
The inputs (x) are fed into the input layer and multiplied by their allotted weights (w); the products are then added together to form a weighted sum. This weighted sum, plus the bias, is then passed through the pertinent activation function to produce the output.
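A minimal sketch of this logic in NumPy follows; the step activation and the example weights, bias, and inputs are assumptions chosen for illustration.

```python
import numpy as np

def perceptron(x, w, b):
    # Weighted sum of inputs plus bias
    z = np.dot(w, x) + b
    # Step activation: the neuron switches ON (1) or OFF (0)
    return 1 if z > 0 else 0

# Example: a perceptron computing logical AND (assumed weights and bias)
w, b = np.array([1.0, 1.0]), -1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", perceptron(np.array(x, dtype=float), w, b))
```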
Weights and Bias
Summation Function
• Since non-linear activation functions are differentiable, the problems related to backpropagation can be solved.
• It permits the stacking of several layers of neurons, which is required for the creation of deep neural networks.
1. Sigmoid or Logistic Activation Function
2. Tanh Activation Function
The tanh activation function works much better than the sigmoid function; we can simply say it is an advanced version of the sigmoid activation function. Since it has a value range between -1 and 1, it is utilized by the hidden layers in the neural network, and for this reason it has made the process of learning much easier.
3. ReLU (Rectified Linear Unit) Activation Function
ReLU is one of the most widely used activation functions in the hidden layers of a neural network. Its value ranges from 0 to infinity. It helps in solving the vanishing-gradient problem that arises during backpropagation. It tends to be less expensive to compute than the sigmoid as well as the tanh activation function. It allows only a few neurons to be activated at a particular instance, which leads to effective as well as easier computations.
4. Softmax Function
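The softmax function converts a vector of raw scores into a probability distribution that sums to 1, and is typically used in the output layer for multi-class classification. As a quick illustration, here is a minimal NumPy sketch of the four activation functions discussed above; the test vector is an arbitrary example.

```python
import numpy as np

def sigmoid(z):
    # 1. Sigmoid/logistic: output in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # 2. Tanh: output in (-1, 1), zero-centered
    return np.tanh(z)

def relu(z):
    # 3. ReLU: 0 for negative inputs, identity for positive ones
    return np.maximum(0.0, z)

def softmax(z):
    # 4. Softmax: exponentiate and normalize so outputs sum to 1
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])  # arbitrary example scores
print(sigmoid(z), tanh(z), relu(z), softmax(z), sep="\n")
```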
Input Values
X1 = 0.05
X2 = 0.10
Initial Weights
W1 = 0.15    W5 = 0.40
W2 = 0.20    W6 = 0.45
W3 = 0.25    W7 = 0.50
W4 = 0.30    W8 = 0.55
Bias Values
b1 = 0.35    b2 = 0.60
Target Values
T1 = 0.01
T2 = 0.99
Forward Pass
H1=x1×w1+x2×w2+b1
H1=0.05×0.15+0.10×0.20+0.35
H1=0.3775
To find the value of y1, we multiply the outcomes of H1 and H2 by their corresponding weights (w5 and w6) and add the bias b2:
y1 = H1×w5 + H2×w6 + b2
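A short script reproducing this forward pass is given below. In the standard version of this worked example, a sigmoid activation is applied to each layer's weighted sum before it is fed forward; that assumption is made here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Values from the example above
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer weighted sums
h1 = x1 * w1 + x2 * w2 + b1          # 0.3775, as computed above
h2 = x1 * w3 + x2 * w4 + b1          # 0.3925
out_h1, out_h2 = sigmoid(h1), sigmoid(h2)

# Output layer (sigmoid applied, per the assumption stated above)
y1 = sigmoid(out_h1 * w5 + out_h2 * w6 + b2)  # ~0.7514
y2 = sigmoid(out_h1 * w7 + out_h2 * w8 + b2)  # ~0.7729
print(h1, y1, y2)
```

Comparing y1 ≈ 0.75 and y2 ≈ 0.77 against the targets T1 = 0.01 and T2 = 0.99 gives the error that backpropagation, discussed next, will reduce.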
For a single training example, the backpropagation algorithm calculates the gradient of the error function. Backpropagation can be written as a function of the neural network. Backpropagation algorithms are a set of methods used to efficiently train artificial neural networks following a gradient descent approach which exploits the chain rule.
The main feature of backpropagation is that it is an iterative, recursive, and efficient method for calculating the updated weights to improve the network until it is able to perform the task for which it is being trained. Backpropagation requires the derivatives of the activation function to be known at network design time.
For example, let’s say we have a network that classifies animals either as cats or dogs. It has two output neurons 𝑦1 and 𝑦2, where the former represents the probability that the animal is a cat and the latter that it’s a dog. Given an image of a cat, we expect 𝑦1 = 1 and 𝑦2 = 0.
We use the cost to update the weights and biases so that the actual
outputs get as close as possible to the desired values. To decide
whether to increase or decrease a coefficient, we calculate its
partial derivative using backpropagation. Let’s explain it with an
example.
When training a neural network, the cost value J quantifies the
network’s error, i.e., its output’s deviation from the ground
truth. We calculate it as the average error over all the objects in
the training set, and our goal is to minimize it.
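For instance, with a squared-error loss per object (an assumed choice; the lecture does not fix one), the cost J could be computed as:

```python
import numpy as np

def cost(predictions, targets):
    # Average squared error over all objects in the training set
    return np.mean((predictions - targets) ** 2)

# Example: the two outputs and targets from the forward-pass example above
print(cost(np.array([0.75, 0.77]), np.array([0.01, 0.99])))
```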
$$w_{\text{new}} = w_{\text{old}} - \eta \frac{\partial J}{\partial w_{\text{old}}}$$

where $\eta$ is the learning rate.
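As a quick illustration of this update rule, the sketch below performs one gradient-descent step on a single weight; the gradient value and learning rate are arbitrary example numbers.

```python
# One gradient-descent step: w_new = w_old - eta * dJ/dw_old
def gd_step(w_old, grad, eta):
    return w_old - eta * grad

w = 0.40      # current weight (example value)
grad = 0.08   # partial derivative dJ/dw at w (example value)
eta = 0.5     # learning rate (example value)
print(gd_step(w, grad, eta))  # 0.36
```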
EXAMPLE:
Let’s say we have only one neuron in the
input, hidden, and output layers:
$$\frac{\partial J}{\partial w_2} = \frac{\partial z_2}{\partial w_2} \cdot \frac{\partial y}{\partial z_2} \cdot \frac{\partial J}{\partial y}$$
Different optimization algorithms are available for error minimization (two of their update rules are sketched after this list):
• Gradient descent
• Adagrad
• Momentum
• Adam
• Ftrl
• RMSprop, etc.
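As a brief sketch, here is how one of the listed optimizers, momentum, modifies the plain gradient-descent update; the hyperparameter values are illustrative assumptions.

```python
# Plain gradient descent: w -= eta * grad
# Momentum: accumulate a velocity term so past gradients keep contributing
def momentum_step(w, v, grad, eta=0.1, beta=0.9):
    v = beta * v + grad        # exponentially weighted gradient history
    return w - eta * v, v

w, v = 0.5, 0.0
for grad in (0.2, 0.2, 0.2):   # the same gradient observed three times
    w, v = momentum_step(w, v, grad)
print(w)  # the effective step size grows as velocity builds up
```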
Types of Neural Network
• Today, MLP machine learning methods can be used to overcome the requirement of high computing power required by modern deep learning architectures.
During training, the CNN uses backpropagation to adjust the weights and
biases of the network to optimize its performance. This process involves
propagating the error back through the network and updating the
parameters of the network to minimize the error.
CNNs have been used to achieve state-of-the-art performance in a variety of
image recognition and classification tasks, including object detection, face
recognition, and image segmentation. They are also used in natural
language processing for tasks such as text classification and sentiment
analysis.
Overall, CNNs are a powerful tool for image analysis and recognition, and
they have shown great potential in many applications. Their ability to learn
spatial hierarchies of features directly from pixel values makes them well-suited for a wide range of image-related tasks.
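To illustrate the spatial feature extraction that underlies this, below is a minimal sketch of a single 2D convolution (valid padding, stride 1) in NumPy; the kernel shown is an assumed vertical-edge detector, not something specified in the lecture.

```python
import numpy as np

def conv2d(image, kernel):
    # Valid convolution, stride 1 (cross-correlation, as in most DL libraries):
    # slide the kernel over every position and take the elementwise sum
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A tiny image containing a vertical edge, and an edge-detecting kernel (assumed)
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)
print(conv2d(image, kernel))  # strong response exactly where the edge lies
```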
• AlexNet. For image classification, as the first CNN to win the ImageNet Challenge in 2012, AlexNet consists of five convolution layers and three fully connected layers. Thus, AlexNet requires 61 million weights and 724 million MACs (multiply-add computations) to classify an image with a size of 227×227.
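To see where such MAC counts come from, here is a back-of-the-envelope calculation for AlexNet's first convolution layer, using its commonly cited dimensions (96 filters of 11×11×3 producing a 55×55 output map); the per-layer breakdown is an illustration, not a figure from the lecture.

```python
# MACs for one conv layer = output positions x filters x kernel volume
filters, kh, kw, cin = 96, 11, 11, 3
out_h, out_w = 55, 55

macs = out_h * out_w * filters * kh * kw * cin
print(f"{macs:,} MACs")  # 105,415,200 -- roughly 105M of the 724M total
```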
• The input of an RNN consists of the current input and the previous samples. Therefore, the connections between nodes form a directed graph along a temporal sequence. Furthermore, each neuron in an RNN owns an internal memory that keeps the information of the computation from the previous samples.
• RNN models are widely used in Natural Language Processing (NLP) due to their superiority in processing data with an input length that is not fixed. The task of the AI here is to build a system that can comprehend natural language spoken by humans, e.g., natural language modeling, word embedding, and machine translation.
• In RNNs, each subsequent layer is a collection of nonlinear functions of weighted sums of outputs and the previous state. Thus, the basic unit of an RNN is called a “cell”, and each cell consists of layers and a series of cells that enables the sequential processing of recurrent neural network models.
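As a minimal sketch of such a cell, the code below assumes a vanilla (Elman-style) recurrence with a tanh nonlinearity, since the lecture does not fix a specific cell type; the input and state sizes are arbitrary.

```python
import numpy as np

def rnn_cell(x_t, h_prev, Wx, Wh, b):
    # New state: nonlinear function of the weighted current input
    # and the weighted previous state (the cell's internal memory)
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

# Assumed sizes: 3-dimensional inputs, 4-dimensional hidden state
rng = np.random.default_rng(0)
Wx, Wh, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)

# Process a sequence of arbitrary length, one sample at a time
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):   # a sequence of 5 time steps
    h = rnn_cell(x_t, h, Wx, Wh, b)
print(h)  # the final state summarizes the whole sequence
```

Because the same cell is applied at every time step, the loop runs for however many samples the sequence contains, which is exactly why RNNs handle inputs of non-fixed length.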