Unit 1 Deep Learning

1.1-Basics of Deep Learning


Biological Neural Network

 Our brain is a neural network: it is full of neurons, and each neuron is connected to
multiple other neurons.
 The dendrites collect the input signals, which are summed up in the cell body and later
transmitted to the next neuron through the axon.

 The neuron is the fundamental building block of neural networks. In biological
systems, a neuron is a cell like any other cell of the body: it has DNA and is generated
in the same way as other cells. Though the DNA may differ, the neuron's function is
similar across all organisms.

 A neuron comprises three major parts: the cell body (also called Soma), the
dendrites, and the axon. The dendrites are like fibers branched in different
directions and are connected to many cells in that cluster.

 Dendrites receive the signals from surrounding neurons, and the axon transmits the
signal to the other neurons. At the ending terminal of the axon, the contact with the
dendrite is made through a synapse.

 The axon is a long fibre that transports the output signal as electric impulses along its
length. Each neuron has one axon. Axons pass impulses from one neuron to another
like a domino effect.

Artificial Neural Network


 Neural networks, also known as artificial neural networks (ANNs) or simulated neural
networks (SNNs), are a subset of machine learning and are at the heart of deep
learning algorithms. Their name and structure are inspired by the human brain.
 Artificial neural networks (ANNs) are composed of node layers: an input
layer, one or more hidden layers, and an output layer.
 Each node, or artificial neuron, connects to another and has an associated weight and
threshold.
 If the output of any individual node is above the specified threshold value, that node
is activated, sending data to the next layer of the network. Otherwise, no data is
passed along to the next layer of the network.
 A perceptron receives multiple inputs, which are processed through functions to
produce an output. In an artificial neural network, weights are assigned to the
connections between neurons, and in the final layer everything is put together to come
up with the answer.

How Neural Networks Work


 The human brain is the inspiration behind neural networks. Brain cells called neurons
form a complex, highly interconnected network and send electrical signals to each
other to help humans process information.
 Similarly, an ANN is made up of artificial neurons that work together to solve
problems.
Input Layer

 Information from the outside world enters the artificial neural network from the input
layer. Input nodes process the data, analyze or categorize it, and pass it on to the next
layer.

Hidden Layer

 Hidden layers take their input from the input layer or other hidden layers. Artificial
neural networks can have a large number of hidden layers. Each hidden layer analyses
the output from the previous layer, processes it further, and passes it on to the next
layer.

Output Layer

 The output layer gives the final result of all the data processing by the artificial neural
network. It can have single or multiple nodes. For instance, if we have a binary
(yes/no) classification problem, the output layer will have one output node, which will
give the result as 1 or 0. However, if we have a multi-class classification problem, the
output layer might consist of more than one output node.

Feed Forward Neural Network

 It processes the data in one direction from input node to output node. Every node in
one layer is connected to every node in the next layer.
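As an illustrative sketch of this one-directional flow (not part of the original notes; the NumPy usage and all weight values are assumptions chosen for the example), the fragment below pushes an input through two fully connected layers:

```python
import numpy as np

def step(z):
    # Hard-threshold activation: 1 where the value reaches 0, else 0.
    return (z >= 0).astype(float)

x = np.array([0.5, 0.8, 0.2])                # input layer values (3 nodes)
W1 = np.array([[ 0.2, -0.5,  0.1,  0.7],
               [ 0.4,  0.3, -0.2,  0.1],
               [-0.6,  0.8,  0.5, -0.3]])    # every input node feeds every hidden node
W2 = np.array([[ 0.3, -0.4],
               [-0.1,  0.6],
               [ 0.5,  0.2],
               [ 0.7, -0.8]])                # every hidden node feeds every output node

h = step(x @ W1)   # data flows forward into the hidden layer...
y = step(h @ W2)   # ...and onward to the output layer, never backward
print(y)
```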

Back-Propagation

 Artificial neural networks learn continuously by using corrective feedback loops to
improve their predictive analysis.
 In simple terms, you can think of data flowing from the input node to the output node
through many different paths in the neural network. Only one path is the correct one
that maps the input node to the output node, and the feedback is used to find it.

1.2-McCulloch-Pitts Neuron


 The motivation behind the McCulloch-Pitts model is the biological neuron. A
biological neuron takes input signals from its dendrites and, after processing, passes
the output on to its connected neurons through axons and synapses.
 This basic working of the biological neuron is what the McCulloch-Pitts model
interprets and mimics.
 The McCulloch-Pitts model of the neuron is a fairly simple model consisting of
some 'n' binary inputs, with a weight associated with each of them.
 The McCulloch-Pitts neuron has only two types of inputs: Excitatory and
Inhibitory.
 The excitatory inputs have weights of positive magnitude, and the inhibitory
inputs have weights of negative magnitude.
 The inputs of the McCulloch-Pitts neuron can be either 0 or 1. It has a
threshold function as its activation function, so the output signal yout is 1 if the
input ysum is greater than or equal to a given threshold value, and 0 otherwise.
 Simple McCulloch-Pitts neurons can be used to design logical operations. For
that purpose, the connection weights need to be correctly decided, along with the
threshold value of the activation function.

 Then we have a summation junction that aggregates all the weighted inputs and
passes the result to the activation function.
 The activation function is a threshold function that outputs 1 if the sum of the
weighted inputs is equal to or above the threshold value, and 0 otherwise.

So let's say we have n inputs = { X1, X2, X3, ...., Xn }

And we have n weights, one for each input = { W1, W2, W3, ...., Wn }
So the summation of weighted inputs X.W = X1.W1 + X2.W2 + X3.W3 + .... + Xn.Wn

If X.W ≥ θ (threshold value)
Output = 1
Else
Output = 0

Let’s take a real-world example:


A bank wants to decide whether it can sanction a loan or not. There are 2 parameters for
the decision: Salary and Credit Score. So there are 4 scenarios to assess:
1. High Salary and Good Credit Score
2. High Salary and Bad Credit Score
3. Low Salary and Good Credit Score
4. Low Salary and Bad Credit Score
Let X1 = 1 denote a high salary and X1 = 0 a low salary, and let X2 = 1 denote a good
credit score and X2 = 0 a bad credit score.
Let the threshold value be 2. The truth table is as follows:

X1    X2    X1+X2    Loan Approved
1     1     2        1
1     0     1        0
0     1     1        0
0     0     0        0
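A minimal sketch of this neuron in Python, applied to the bank-loan example (the function name mp_neuron and the encoding are mine, purely illustrative):

```python
def mp_neuron(inputs, threshold):
    # McCulloch-Pitts neuron: binary inputs, unit weights, hard threshold.
    return 1 if sum(inputs) >= threshold else 0

# Loan example: x1 = high salary, x2 = good credit score, threshold = 2.
for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, "-> loan approved:", mp_neuron([x1, x2], 2))
# Only (1, 1) reaches the threshold of 2, matching the truth table above.
```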

The truth table shows when the loan should be approved, considering all the varying
scenarios. In this case, the loan is approved only if the salary is high and the credit score is
good. The McCulloch-Pitts model of the neuron was mankind's first attempt at mimicking the
human brain, and a fairly simple one too. It is no surprise it had many limitations:
1. The model failed to capture and compute cases of non-binary inputs; it was limited to
computing every case with 0 and 1 only.
2. The threshold had to be decided beforehand, by manual computation, instead of the
model learning it itself.
3. Functions that are not linearly separable could not be computed.

1.3-Threshold Logic of the McCulloch-Pitts Neuron

OR Function

The inputs of the OR function are Boolean, so only 4 combinations are possible: (0,0),
(0,1), (1,0) and (1,1).

Now, plotting them on a 2D graph and making use of the OR function's aggregation equation,
i.e., x_1 + x_2 ≥ 1, we can draw the decision boundary: the line x_1 + x_2 = 1.

Truth Table of OR Function

X1    X2    Output
0     0     0
0     1     1
1     0     1
1     1     1
We just used the aggregation equation, i.e., x_1 + x_2 ≥ 1, to show graphically that all the
inputs that produce output 1 when passed through the OR-function M-P neuron lie ON or
ABOVE the line x_1 + x_2 = 1, and all the input points that lie BELOW that line produce
output 0. The M-P neuron has just learnt a linear decision boundary! It splits the input points
into two classes, positive and negative: the positive ones (which output 1) are those that lie
ON or ABOVE the decision boundary, and the negative ones (which output 0) are those that
lie BELOW it.

AND Function

Truth Table of AND Function

X1    X2    Output
0     0     0
0     1     0
1     0     0
1     1     1
Similar to the OR function, we can plot the graph for the AND function, for which the
aggregation equation is x_1 + x_2 ≥ 2.

In this case, the decision boundary is the line x_1 + x_2 = 2. The only input point that lies ON
or ABOVE this boundary is (1,1), and it is the only point that outputs 1 when passed through
the AND-function M-P neuron. It fits! The decision boundary works!

From these examples, we can see that as the number of inputs increases, the dimension of the
space in which the points are plotted also increases.
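A quick check of both gates with the same McCulloch-Pitts unit, changing only the threshold (an illustrative sketch, not from the notes):

```python
def mp_neuron(x1, x2, threshold):
    # McCulloch-Pitts unit with unit weights and a hard threshold.
    return 1 if x1 + x2 >= threshold else 0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"OR({x1},{x2}) = {mp_neuron(x1, x2, 1)}   "
          f"AND({x1},{x2}) = {mp_neuron(x1, x2, 2)}")
# OR uses the boundary x1 + x2 = 1; AND shifts the same line out to x1 + x2 = 2.
```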

1.4-Perceptrons
 A human neuron collects inputs from other neurons using dendrites and sums all the
inputs; if the total is greater than a threshold value, it produces an output.
 A perceptron is a mathematical model of a neuron. It receives weighted inputs, which
are added together and passed to an activation function that determines whether the
neuron should fire and produce an output.

 There are many activation functions with different properties, but one of the simplest
is the step function. A step function outputs 1 if the input is higher than the threshold
value, and 0 otherwise.
 Input Nodes or Input Layer:

This is the primary component of the Perceptron, which accepts the initial data into the
system for further processing. Each input node contains a real numerical value.

o Weight and Bias:

The weight parameter represents the strength of the connection between units and is another
important parameter of the Perceptron. Weight is directly proportional to the strength of the
associated input neuron in deciding the output. Further, the bias can be thought of as the
intercept in a linear equation.

o Activation Function:

These are the final and important components, which help determine whether the neuron will
fire or not. The activation function of the Perceptron can be considered primarily as a step
function.

Types of Activation functions:

o Sign function
o Step function, and
o Sigmoid function

Example of a Perceptron

Inputs

X1=0.9, X2=0.7

Weights

W1=0.2, W2=0.9

The activation function threshold is equal to 0.75, then:

X1W1 + X2W2 = 0.9*0.2 + 0.7*0.9
X1W1 + X2W2 = 0.81

Result: since 0.81 ≥ 0.75, the neuron fires and the output is 1.
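The arithmetic above can be verified in a few lines of Python (the variable names are mine, for illustration):

```python
x = [0.9, 0.7]        # inputs X1, X2
w = [0.2, 0.9]        # weights W1, W2
threshold = 0.75

weighted_sum = sum(xi * wi for xi, wi in zip(x, w))   # 0.9*0.2 + 0.7*0.9 = 0.81
output = 1 if weighted_sum >= threshold else 0
print(weighted_sum, output)                            # 0.81 1 -> the neuron fires
```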

How a Perceptron Works


 The perceptron is considered a single-layer neural network that consists of four
main parameters: input values (input nodes), weights and bias, net sum, and an
activation function.
 The perceptron model begins by multiplying all input values by their weights,
then adds these values together to create the weighted sum.
 This weighted sum is then passed to the activation function 'f' to obtain the
desired output. This activation function is also known as the step function and is
represented by 'f'.

 This step function or Activation function plays a vital role in ensuring that output
is mapped between required values (0,1) or (-1,1).
 It is important to note that the weight of input is indicative of the strength of a
node. Similarly, an input's bias value gives the ability to shift the activation
function curve up or down.
 The perceptron model works in two important steps, as follows:

Step-1
 In the first step, multiply all input values by their corresponding weight values
and then add them up to determine the weighted sum. Mathematically, we can
calculate the weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + .... + xn*wn

 Add a special term called the bias 'b' to this weighted sum to improve the model's
performance:

∑wi*xi + b

Step-2

 In the second step, the activation function is applied to the above-mentioned
weighted sum, which gives us an output either in binary form or as a continuous
value, as follows:

Y = f(∑wi*xi + b)
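Putting the two steps together, a minimal perceptron forward pass might look like the following sketch (illustrative only; here the bias b = -0.75 plays the role of the earlier threshold of 0.75):

```python
def perceptron(x, w, b):
    # Step 1: weighted sum of inputs plus the bias term.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Step 2: step activation maps the sum to a binary output.
    return 1 if z >= 0 else 0

print(perceptron([0.9, 0.7], [0.2, 0.9], b=-0.75))   # prints 1, as in the example above
```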

1.5-Perceptron Learning Algorithm

 The perceptron model is a more general computational model than the McCulloch-
Pitts neuron. It takes an input, aggregates it (weighted sum) and returns 1 only if
the aggregated sum is more than some threshold, else returns 0.
 Rewriting the threshold as above and making it a constant input with a variable
weight, we end up with the following rule: introduce a constant input x0 = 1 with
weight w0 = -θ, so the neuron outputs 1 if ∑wi*xi + w0 ≥ 0, and 0 otherwise.
Vector

A vector is anything that sits anywhere in space and has a magnitude and a direction.

Dot Product of Two Vectors

Imagine you have two vectors of size n+1, w and x; the dot product of these vectors (w.x) is
computed as follows:

w.x = w0*x0 + w1*x1 + w2*x2 + .... + wn*xn

Angle Between Two Vectors

The same dot product can be computed differently if you know the angle between the vectors
and their individual magnitudes:

w.x = |w| * |x| * cos(α)

The other way around, you can get the angle between two vectors from the vectors
themselves, given you know how to calculate vector magnitudes and their dot product:

cos(α) = w.x / (|w| * |x|)
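Both formulas can be checked numerically; a small NumPy sketch (the example vectors are made up):

```python
import numpy as np

w = np.array([1.0, 2.0, 2.0])
x = np.array([2.0, 0.0, 1.0])

dot = np.dot(w, x)                                # w.x = sum of wi*xi = 4.0
cos_alpha = dot / (np.linalg.norm(w) * np.linalg.norm(x))
alpha = np.degrees(np.arccos(cos_alpha))          # angle between w and x, ~53.4 degrees
print(dot, alpha)
```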

Our goal is to find the w vector that can perfectly classify the positive inputs and the negative
inputs in our data.
 We initialize w with some random vector. We then iterate over all the examples in the
data (P ∪ N), both positive and negative.
 Now, if an input x belongs to P, what should the dot product w.x ideally be? Greater
than or equal to 0, because that is what our perceptron wants at the end of the day, so
let's give it that. And if x belongs to N, the dot product MUST be less than 0.

Case 1: When x belongs to P and its dot product w.x < 0 → update w = w + x

Case 2: When x belongs to N and its dot product w.x ≥ 0 → update w = w - x

We have already established that when x belongs to P, we want w.x ≥ 0 (the basic perceptron
rule). What that also means is that when x belongs to P, the angle between w and x should be
less than 90 degrees.

So, whatever the w vector may be, it must make an angle of less than 90 degrees with the
positive example vectors (x ∈ P) and an angle of more than 90 degrees with the negative
example vectors (x ∈ N).

So, when we add x to w, which we do when x belongs to P and w.x < 0 (Case 1), we are
essentially increasing the cos(α) value, i.e., decreasing α, the angle between w and x, which is
what we desire. The same intuition, in reverse, works for the case when x belongs to N and
w.x ≥ 0 (Case 2): subtracting x decreases cos(α) and increases the angle.
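The whole learning algorithm fits in a few lines. The sketch below is illustrative (the toy data points are made up); it keeps cycling through P and N until no example is misclassified, which is only guaranteed to terminate when the data are linearly separable:

```python
import numpy as np

# Toy linearly separable data; the leading 1.0 is the constant input that
# absorbs the threshold as a bias weight.
P = [np.array([1.0, 2.0, 1.0]), np.array([1.0, 3.0, 2.0])]      # positive examples
N = [np.array([1.0, -1.0, -2.0]), np.array([1.0, -2.0, -1.0])]  # negative examples

w = np.random.randn(3)            # initialize w with some random vector
converged = False
while not converged:
    converged = True
    for x in P:
        if np.dot(w, x) < 0:      # Case 1: x in P but w.x < 0
            w = w + x             # adding x shrinks the angle between w and x
            converged = False
    for x in N:
        if np.dot(w, x) >= 0:     # Case 2: x in N but w.x >= 0
            w = w - x             # subtracting x widens the angle
            converged = False
print(w)                           # a w that classifies all examples correctly
```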

1.6-Sigmoid Neuron

 The sigmoid neuron is similar to the perceptron, but slightly modified so that the
output from the sigmoid neuron is much smoother than the step-function output of
the perceptron.
 In the sigmoid neuron, a small change in the input causes only a small change in
the output, as opposed to the stepped output of the perceptron.
 There are many functions with the characteristics of an "S"-shaped curve, known
as sigmoid functions. The most commonly used is the logistic function.
 We no longer see a sharp transition at the threshold b. The output of the
sigmoid neuron is not 0 or 1; instead, it is a real value between 0 and 1, which can
be interpreted as a probability.

 In a sigmoid neuron, every input xi has a weight wi associated with it. The
weights depict the importance of the inputs in the decision-making process. The
output of the sigmoid neuron ranges between 0 and 1, which can be interpreted as
a probability, rather than being 0 or 1 as in the perceptron model.
 In the case of a 1-dimensional input x, the sigmoid function that describes the
input-output relationship is given by

y = 1 / (1 + e^-(w*x + b))

 In the case of a 2-dimensional input, i.e. 2 input features, the sigmoid function is
given by

y = 1 / (1 + e^-(w1*x1 + w2*x2 + b))

 In the case of a high-dimensional input with many input features, the sigmoid
function is given by

y = 1 / (1 + e^-(∑wi*xi + b))

 Advantages
 Unlike the perceptron and the M-P neuron, which have binary outputs, the
sigmoid function's output lies between 0 and 1.
 The sigmoid function also deals more gracefully with data that are not linearly
separable: a single sigmoid neuron still cannot completely separate the positive
points from the negative points, but its smooth, graded output is more forgiving
near the boundary than a hard 0/1.
 Loss function: the sum of the squared differences between the true output and the
predicted output:

L = ∑ (Yi - Ŷi)²

Yi = True output
Ŷi = Predicted output
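A sketch of the sigmoid neuron and the squared-error loss described above (function and variable names are mine):

```python
import numpy as np

def sigmoid_neuron(x, w, b):
    # Logistic output: a real value in (0, 1), interpretable as a probability.
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def squared_loss(y_true, y_pred):
    # Sum of squared differences between true and predicted outputs.
    return np.sum((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

y_hat = sigmoid_neuron(np.array([0.9, 0.7]), np.array([0.2, 0.9]), b=-0.75)
print(y_hat, squared_loss([1.0], [y_hat]))
```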

1.7-Multilayer Perceptrons (MLP)

 A multilayer perceptron has one input layer with one neuron for each input. It has one
output layer with a single node for each output, and it can have any number of hidden
layers, each with any number of nodes.
 MLP networks are used in a supervised learning format. The typical learning algorithm
for MLP networks is the back-propagation algorithm.
 An MLP is a feed-forward neural network, which means that the data flows from the
input layer to the output layer in the forward direction.

 In the example network, there are three inputs and thus three input nodes, and the
hidden layer also has three nodes. The output layer gives two outputs, therefore there
are two output nodes. The nodes in the input layer take the input and forward it for
further processing.
 Every node in the multilayer perceptron uses a sigmoid activation function. The
sigmoid activation function takes a real value as input and converts it into a number
between 0 and 1.
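A forward pass through the 3-3-2 network described above might be sketched as follows (the random weights and NumPy usage are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)   # input (3 nodes) -> hidden (3 nodes)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)   # hidden (3 nodes) -> output (2 nodes)

x = np.array([0.5, 0.1, 0.9])                   # three inputs
h = sigmoid(x @ W1 + b1)                        # hidden layer, sigmoid activation
y = sigmoid(h @ W2 + b2)                        # two outputs, each in (0, 1)
print(y)
```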

Backpropagation

 Backpropagation is the learning mechanism that allows the multilayer perceptron to
iteratively adjust the weights in the network, with the goal of minimizing the cost
function.
 In an artificial neural network (ANN) the values of the weights and biases are
randomly initialized. Due to this random initialization, the neural network makes
errors in producing the correct output. To reduce these error values as much as
possible, we need a mechanism that compares the desired output of the neural
network with the actual network output, and adjusts the weights and biases such that
the network gets closer to the desired output after each iteration.
 For this, we train the network such that it back-propagates the error and updates the
weights and biases.
 The principle behind backpropagation is to reduce the error arising from the
randomly allocated weights and biases so that the network produces the correct
output. We update the weights so that the global loss is minimized.
 How backpropagation works, step by step:

1. After calculating the output of the MLP neural network, calculate the error.
2. This error is the difference between the output generated by the neural network
and the actual output. The calculated error is fed back into the network, from the
output layer towards the hidden layers.
3. This error signal now becomes the input for adjusting the network.
4. The model reduces the error by adjusting the weights in the hidden layers.
5. Calculate the predicted output with the adjusted weights and check the error. This
process is repeated until the error is minimized or eliminated.
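A compact, illustrative sketch of these steps for a tiny one-hidden-layer network with sigmoid activations and squared error (all names and values are mine; the gradient formulas follow from the chain rule):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 1))  # random initialization
x, y_true = np.array([0.5, 0.8]), np.array([1.0])
lr = 0.5                                                    # learning rate

for _ in range(1000):
    # Forward pass: compute the network's output.
    h = sigmoid(x @ W1)
    y = sigmoid(h @ W2)
    # Error at the output, fed back through the network (chain rule).
    delta_out = (y - y_true) * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    # Adjust the weights to reduce the error.
    W2 -= lr * np.outer(h, delta_out)
    W1 -= lr * np.outer(x, delta_hid)

print(y)   # moves toward the desired output 1.0 as the error shrinks
```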

1.8-Representation Power of Multilayer Perceptrons (MLP)

XOR function

 The XOR function takes two binary inputs (0 or 1) and returns 1 if the inputs are
different (one is 0 and the other is 1); otherwise, it returns 0.
 The XOR function cannot be implemented using a single perceptron, since it is not a
linearly separable function. However, we can achieve the XOR functionality by
combining multiple perceptrons in a network.
 A perceptron makes decisions based on a linear combination of its inputs, followed by
applying an activation function.
 The decision boundary of a perceptron is a hyperplane, which is a straight line in two-
dimensional space.
 However, the XOR function requires a non-linear decision boundary, which cannot be
achieved by a single perceptron.
 To implement the XOR function, we need a network of perceptrons, specifically a
multilayer perceptron (MLP) with at least one hidden layer.
 The hidden layer(s) allow the network to learn non-linear mappings between the
inputs and outputs.
 By combining multiple perceptrons and using non-linear activation functions, an MLP
can model complex relationships and achieve the necessary non-linear decision
boundaries to implement the XOR function.

XOR Function Truth Table

X1    X2    Y
0     0     0
0     1     1
1     0     1
1     1     0
Conditions for the implementation of XOR (let w0 be the threshold of the output neuron):

w1 < w0

w2 ≥ w0

w3 ≥ w0

w4 < w0

Each hidden neuron Hi fires for exactly one of the four input patterns, so the output neuron's
weighted sum reduces to the single weight wi of the active hidden unit, as the table below
shows.

X1    X2    XOR    H1    H2    H3    H4    ∑wi*hi
0     0     0      1     0     0     0     w1
0     1     1      0     1     0     0     w2
1     0     1      0     0     1     0     w3
1     1     0      0     0     0     1     w4
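The construction in the table can be written out directly. In the sketch below (hypothetical but consistent with the conditions above, using w1 = w4 = 0, w2 = w3 = 1 and w0 = 1), each hidden unit fires for exactly one input pattern:

```python
import numpy as np

def step(z, theta):
    # Fires (1) when the weighted sum reaches the threshold theta.
    return (np.asarray(z) >= theta).astype(int)

# Hidden layer: one unit per input pattern (H1..H4 in the table above).
Wh = np.array([[-1, -1,  1,  1],    # weights from x1 to H1..H4
               [-1,  1, -1,  1]])   # weights from x2 to H1..H4
th = np.array([0, 1, 1, 2])         # thresholds so each Hi fires for one pattern

w = np.array([0, 1, 1, 0])          # output weights w1..w4 (w1, w4 < w0 <= w2, w3)
w0 = 1                              # output threshold

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = step(np.array(x) @ Wh, th)  # exactly one hidden unit fires
    y = 1 if h @ w >= w0 else 0     # weighted sum against threshold w0
    print(x, "-> XOR =", y)
```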
