Unit 4
2. (a) Discuss Hebbian Learning rule and winner-take all learning rule.
Artificial neural networks (ANNs) are computing systems developed by taking inspiration from
biological neural networks, of which the human brain is the prime example. These networks are
made functional through training that follows some kind of learning rule.
A learning rule enhances an artificial neural network's performance by updating the weights and
bias levels of the network when certain conditions are met during training. It is therefore a crucial
part of developing a neural network.
Developed by Donald Hebb in 1949, Hebbian learning is an unsupervised learning rule that works by
adjusting the weight between two neurons in proportion to the product of their activations.
According to this rule, the weight between two neurons is increased if they activate together and
decreased if they work in opposite directions; if there is no correlation between their signals, the
weight remains the same. As for the sign of the update, it is positive when both nodes have the
same sign (both positive or both negative), and negative when one node is positive and the other
negative.
Formula
Δwi = α xi y
where,
• α is the learning rate
• xi is the i-th input
• y is the output
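As a minimal sketch of this rule (assuming a single linear neuron; the function and variable names below are illustrative, not part of the original text):

```python
import numpy as np

def hebbian_update(w, x, alpha=0.1):
    """One Hebbian step: delta_w = alpha * x * y, with no teacher signal."""
    y = np.dot(w, x)           # output of a single linear neuron
    return w + alpha * x * y   # weights grow when x and y are correlated

# Repeated presentation of the same pattern strengthens the weights.
w = np.array([0.1, 0.1])
for _ in range(3):
    w = hebbian_update(w, np.array([1.0, 1.0]))
print(w)
```

Note that, because the update has no error term, the weights grow without bound unless some normalization is added.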
Perceptron Learning Rule
Developed by Rosenblatt, the perceptron learning rule is an error-correction rule used in a single-
layer feed-forward network. Unlike Hebbian learning, it is a supervised learning rule.
This rule works by finding the difference between the actual and desired outputs and adjusting the
weights accordingly. Naturally, the rule requires a set of input vectors with known desired outputs,
along with the weights, so that it can produce an output.
Formula
w = w + η(y − ŷ)x
where,
• w is the weight vector
• η is the learning rate
• y is the desired output
• ŷ is the predicted output
• x is the input vector
Correlation Learning Rule
Based on a principle similar to the Hebbian rule, the correlation learning rule also increases or
decreases the weights depending on the phases of the two neurons.
If the neurons are in opposite phases, the weight is pushed towards the negative side, and if they
are in the same phase, towards the positive side.
The only thing that makes this rule different from the Hebbian learning rule is that it is supervised
in nature, because it uses the desired (target) output tj instead of the actual output.
Formula
Δwij = α xi tj
Perceptron
The perceptron was introduced by Frank Rosenblatt in 1957. He proposed a perceptron learning rule
based on the original MCP (McCulloch-Pitts) neuron. A perceptron is an algorithm for supervised
learning of binary classifiers. The algorithm enables neurons to learn, processing the elements of
the training set one at a time.
Perceptron is a type of artificial neural network, which is a fundamental concept in machine learning.
The basic components of a perceptron are:
1. Input Layer: The input layer consists of one or more input neurons, which receive input
signals from the external world or from other layers of the neural network.
2. Weights: Each input neuron is associated with a weight, which represents the strength of the
connection between the input neuron and the output neuron.
3. Bias: A bias term is added to the input layer to provide the perceptron with additional
flexibility in modeling complex patterns in the input data.
4. Activation Function: The activation function determines the output of the perceptron based
on the weighted sum of the inputs and the bias term. Common activation functions used in
perceptrons include the step function, the sigmoid function, and the ReLU function (see the
sketch after this list).
5. Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates
the class or category to which the input data belongs.
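As a small illustration of the three activation functions named above (a sketch; the NumPy implementations are mine, not from the original text):

```python
import numpy as np

def step(z):
    """Heaviside step: the classic perceptron activation, output 0 or 1."""
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    """Squashes the net input smoothly into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(step(z), sigmoid(z), relu(z))
```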
Types of Perceptron:
1. Single layer: A single-layer perceptron can learn only linearly separable patterns.
2. Multilayer: A multilayer perceptron has two or more layers and therefore greater processing
power, so it can learn more complex patterns.
The Perceptron algorithm learns the weights for the input signals in order to draw a linear decision
boundary.
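The following is a minimal sketch of such training, combining the step activation with the update rule w = w + η(y − ŷ)x given above (the toy AND problem and all names are illustrative):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=10):
    """Perceptron rule: w <- w + eta * (y - y_hat) * x, with a bias term."""
    X = np.hstack([X, np.ones((len(X), 1))])     # fold the bias into the weights
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, target in zip(X, y):
            y_hat = 1 if np.dot(w, xi) >= 0 else 0   # step activation
            w += eta * (target - y_hat) * xi         # error-correction update
    return w

# Logical AND is linearly separable, so the rule converges to a separating line.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
print(train_perceptron(X, y))
```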
Advantages:
• It can reach a comparable accuracy ratio on both large and small datasets.
Disadvantages:
• It is tough to quantify how much each independent variable affects the dependent variable.
2. Explain various mathematical models of Neurons with their learning rules. Discuss in detail
Gradient Descent Algorithm.
Gradient Descent is known as one of the most commonly used optimization algorithms to train
machine learning models by means of minimizing errors between actual and expected results.
Further, gradient descent is also used to train Neural Networks.
The behaviour of gradient-based optimization around the local minimum or local maximum of a
function can be described as follows:
o If we move towards the negative gradient, i.e. away from the gradient of the function at the
current point, we approach the local minimum of that function. This procedure is known as
gradient descent, also called steepest descent.
o If we move towards the positive gradient, i.e. towards the gradient of the function at the
current point, we approach the local maximum of that function. This procedure is known as
gradient ascent.
The main objective of using a gradient descent algorithm is to minimize the cost function using
iteration. To achieve this goal, it performs two steps iteratively: first it computes the gradient (the
first-order derivative) of the cost function at the current point, and then it takes a step in the
direction opposite to the gradient, scaled by the learning rate.
Before looking at the working principle of gradient descent, we should know how to describe the
slope of a line, as in linear regression. The equation for simple linear regression is given as:
Y = mX + c
where 'm' represents the slope of the line, and 'c' represents the intercept on the y-axis.
An arbitrary starting point is chosen to evaluate the performance. At this starting point, we take
the first derivative to obtain the slope, and use the tangent line at that point to measure the
steepness of this slope. This slope then informs the updates to the parameters (weights and bias).
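As a minimal sketch of this loop on the linear model Y = mX + c (the toy data, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

# Toy data generated from Y = 2X + 1, so the answer is known in advance.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2 * X + 1

m, c = 0.0, 0.0   # arbitrary starting point
lr = 0.05         # learning rate (step size)

for _ in range(500):
    error = (m * X + c) - Y
    # Gradients of the mean squared error with respect to m and c.
    grad_m = 2 * np.mean(error * X)
    grad_c = 2 * np.mean(error)
    # Step in the direction opposite to the gradient.
    m -= lr * grad_m
    c -= lr * grad_c

print(m, c)   # approaches m = 2, c = 1
```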
Batch gradient descent (BGD) computes the error for each point in the training set and updates the
model only after evaluating all training examples. One full pass over the data is known as a training
epoch. In simple words, we have to sum over all examples for each single update.
o It is computationally efficient, as all resources are used to process all training samples together.
Stochastic gradient descent (SGD) is a type of gradient descent that runs one training example per
iteration. In other words, within each epoch it updates the parameters after every individual
training example in the dataset. As it requires only one training example at a time, it is easier to
store in allocated memory.
Because learning happens on every example, SGD offers a few advantages over the other gradient
descent variants, such as more frequent updates.
Mini-batch gradient descent is a combination of batch gradient descent and stochastic gradient
descent. It divides the training dataset into small batches and then performs the updates on those
batches separately. Splitting the training dataset into smaller batches strikes a balance between
the computational efficiency of batch gradient descent and the speed of stochastic gradient descent.
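A minimal sketch of this trade-off on the same linear model as above (names and hyperparameters are illustrative; setting batch_size to len(X) recovers batch gradient descent, and setting it to 1 recovers SGD):

```python
import numpy as np

def minibatch_gd(X, Y, lr=0.05, batch_size=2, epochs=200):
    """Mini-batch gradient descent for the linear model Y = mX + c."""
    m, c = 0.0, 0.0
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(len(X))          # shuffle once per epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            error = (m * X[idx] + c) - Y[idx]
            m -= lr * 2 * np.mean(error * X[idx])   # update per mini-batch
            c -= lr * 2 * np.mean(error)
    return m, c

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2 * X + 1
print(minibatch_gd(X, Y))   # approaches (2, 1)
```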
What is backpropagation?
• Computing the gradient in the backpropagation algorithm helps to minimize the cost function;
the computation is implemented with the chain rule from calculus, which lets the derivatives
be propagated through the layers of the neural network.
• The algorithm consists of two phases:
• Forward pass
• Backward pass
• In the forward pass, the input is initially fed into the input layer; these raw data inputs are
what we use to train the neural network.
• The inputs and their corresponding weights are passed to the hidden layer, which performs
the computation on the data it receives. If there are two hidden layers h1 and h2 in the
network, the output of h1 is used as the input of h2. The bias is added before applying the
activation function.
How does backward pass work?
• In the backward pass, the error is transmitted back through the network, which helps the
network improve its performance by learning and adjusting its internal weights.
• To measure the error generated by the forward pass, one of the most commonly used
methods is the mean squared error (MSE), which computes the squared difference between
the predicted output and the desired output.
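As a minimal sketch of both passes on a tiny one-hidden-layer network (the sigmoid activations, layer sizes, and names are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input -> hidden
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden -> output

x = np.array([0.5, -0.2])   # one input example
t = np.array([1.0])         # desired output
lr = 0.5

for _ in range(100):
    # Forward pass: weighted sum plus bias, then activation, layer by layer.
    h = sigmoid(x @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule applied to the squared error (y - t)^2.
    delta_out = 2 * (y - t) * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    W2 -= lr * np.outer(h, delta_out); b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hid); b1 -= lr * delta_hid
```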
2. Differentiate between Biological neural network & ANN. Explain Hopfield network with
algorithm and learning rule.
1. Artificial Neural Network: An Artificial Neural Network (ANN) is a type of neural network that is
based on a Feed-Forward strategy. It is called this because information passes through the nodes
continuously until it reaches the output node. It is also known as the simplest type of neural
network.
An advantage of ANN:
• ANN copes with highly volatile data and serves best in financial time series forecasting.
A disadvantage of ANN:
• The simple architecture makes it difficult to explain the behavior of the network.
2. Biological Neural Network: A Biological Neural Network (BNN) is a structure that consists of
synapses, dendrites, a cell body, and an axon. In this neural network, the processing is carried out
by neurons: dendrites receive signals from other neurons, the soma (cell body) sums all the
incoming signals, and the axon transmits the signals to other cells.
Biological Neural Networks (BNNs) and Artificial Neural Networks (ANNs) are both composed of
similar basic components, but there are some differences between them.
Neurons: In both BNNs and ANNs, neurons are the basic building blocks that process and transmit
information. However, the neurons of BNNs are more complex and diverse than those of ANNs. In
BNNs, neurons have multiple dendrites that receive input from multiple sources, and their axons
transmit signals to other neurons, while in ANNs neurons are simplified and usually have only a
single output.
Synapses: In both BNNs and ANNs, synapses are the points of connection between neurons, where
information is transmitted. However, in ANNs, the connections between neurons are usually fixed,
and the strength of the connections is determined by a set of weights, while in BNNs, the
connections between neurons are more flexible, and the strength of the connections can be
modified by a variety of factors, including learning and experience.
Neural Pathways: In both BNNs and ANNs, neural pathways are the connections between neurons
that allow information to be transmitted throughout the network. However, in BNNs, neural
pathways are highly complex and diverse, and the connections between neurons can be modified by
experience and learning. In ANNs, neural pathways are usually simpler and predetermined by the
architecture of the network.
The correspondence and contrasts can be tabulated as follows:
Structure: ANN: input, weight, output; BNN: dendrites, synapse, axon.
Learning: ANN needs very precise structures and formatted data; BNN can tolerate ambiguity.
Processor: ANN is complex; BNN is simple.
Computing: ANN is centralized; BNN is distributed.
Head-to-head comparison between Artificial Neural Network and Biological Neural Network
Here, you will learn head-to-head comparisons between Artificial Neural Network and Biological
Neural Network. The main differences between Artificial Neural Network and Biological Neural
Network are as follows:
Processing: ANN processing is sequential and centralized; BNN processes the information in a
parallel and distributive manner.
Control Mechanism: In an ANN, a control unit keeps track of all computer-related operations; in a
BNN, all processing is managed centrally.
Hopfield neural network was invented by Dr. John J. Hopfield in 1982. It consists of
a single layer which contains one or more fully connected recurrent neurons. The
Hopfield network is commonly used for auto-association and optimization tasks.
Architecture
Following are some important points to keep in mind about discrete Hopfield
network −
• This model consists of neurons with one inverting and one non-inverting
output.
• The output of each neuron is fed as input to all the other neurons, but not back to
itself.
• Weight/connection strength is represented by wij.
• Connections can be excitatory as well as inhibitory: excitatory if the output of the
neuron is the same as the input, otherwise inhibitory.
• Weights are symmetrical, i.e. wij = wji.
The outputs from Y1 going to Y2, Yi and Yn carry the weights w12, w1i and w1n respectively;
similarly, the other arcs carry their own weights.
Training Algorithm
The weights are obtained with the Hebbian principle from the P patterns to be stored. For binary
input patterns s(p), p = 1 to P:
wij = Σp=1..P [2si(p) − 1][2sj(p) − 1], for i ≠ j
For bipolar input patterns:
wij = Σp=1..P si(p) sj(p), for i ≠ j
Testing Algorithm
Step 1 − Initialize the weights, which are obtained from training algorithm by using
Hebbian principle.
Step 2 − Perform steps 3-9, if the activations of the network have not converged.
Step 3 − For each input vector X, perform steps 4-8.
Step 4 − Make initial activation of the network equal to the external input
vector X as follows −
yi = xi for i = 1 to n
Step 5 − For each unit Yi, perform steps 6-9.
Step 6 − Calculate the net input of the network as follows −
yin,i = xi + Σj yj wji
Step 7 − Apply the activation as follows over the net input to calculate the output −
yi = 1 if yin,i > θi; yi (unchanged) if yin,i = θi; 0 if yin,i < θi
Step 8 − Broadcast this output yi to all the other units.
Step 9 − Test the network for convergence.
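A minimal sketch of this training and recall procedure for bipolar patterns (the 4-unit example, thresholds of zero, and all names are illustrative assumptions):

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian weights for bipolar patterns: w_ij = sum_p s_i(p)*s_j(p), w_ii = 0."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for s in patterns:
        W += np.outer(s, s)
    np.fill_diagonal(W, 0)   # no self-connections
    return W

def hopfield_recall(W, x, steps=5):
    """Asynchronous updates with net input y_in,i = x_i + sum_j y_j * w_ji."""
    y = x.copy()
    for _ in range(steps):
        for i in range(len(y)):
            net = x[i] + W[i] @ y
            if net != 0:                  # threshold theta_i = 0 assumed
                y[i] = 1 if net > 0 else -1
    return y

pattern = np.array([[1, -1, 1, -1]])
W = hopfield_train(pattern)
noisy = np.array([1, 1, 1, -1])          # one bit flipped
print(hopfield_recall(W, noisy))         # recovers [1, -1, 1, -1]
```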
3. Explain supervised and unsupervised learning. Write and explain Multilayer Perceptron Model.
Supervised learning is a type of machine learning algorithm that learns from labeled data. Labeled
data is data that has been tagged with a correct answer or classification.
Key Points:
• The machine learns the relationship between inputs (for example, fruit images) and outputs (the corresponding fruit labels).
• The trained machine can then make predictions on new, unlabeled data.
• Regression: A regression problem is when the output variable is a real value, such as
“dollars” or “weight”.
1- Regression
Regression is a type of supervised learning that is used to predict continuous values, such as house
prices, stock prices, or customer churn. Regression algorithms learn a function that maps from the
input features to the output value.
• Linear Regression
• Polynomial Regression
2- Classification
Classification is a type of supervised learning that is used to predict categorical values, such as
whether a customer will churn or not, whether an email is spam or not, or whether a medical image
shows a tumor or not. Classification algorithms learn a function that maps from the input features to
a probability distribution over the output classes.
• Logistic Regression
• Decision Trees
• Random Forests
• Naive Bayes
• Spam filtering: Supervised learning algorithms can be trained to identify and classify spam
emails based on their content, helping users avoid unwanted messages.
• Image classification: Supervised learning can automatically classify images into different
categories, such as animals, objects, or scenes, facilitating tasks like image search, content
moderation, and image-based product recommendations.
• Medical diagnosis: Supervised learning can assist in medical diagnosis by analyzing patient
data, such as medical images, test results, and patient history, to identify patterns that
suggest specific diseases or conditions.
• Fraud detection: Supervised learning models can analyze financial transactions and identify
patterns that indicate fraudulent activity, helping financial institutions prevent fraud and
protect their customers.
Advantages:
• We have complete control over choosing the number of classes we want in the training data.
Disadvantages:
• Training for supervised learning needs a lot of computation, so it requires a lot of time.
Unsupervised learning is a type of machine learning that learns from unlabeled data. This means that
the data does not have any pre-existing labels or categories. The goal of unsupervised learning is to
discover patterns and relationships in the data without any explicit guidance.
• Clustering: A clustering problem is where you want to discover the inherent groupings in the
data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules that
describe large portions of your data, such as 'people who buy X also tend to buy Y'.
Clustering
Clustering is a type of unsupervised learning that is used to group similar data points
together. Clustering algorithms work by iteratively assigning data points to clusters so that points
end up close to their own cluster's center and far from the data points in other clusters.
1. Exclusive (partitioning)
2. Agglomerative
3. Overlapping
4. Probabilistic
Clustering Types:-
1. Hierarchical clustering
2. K-means clustering
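As a minimal sketch of the iterative assign-and-update idea behind clustering, here is a plain k-means implementation (the 2-D toy data and all names are illustrative):

```python
import numpy as np

def kmeans(X, k=2, iters=10, seed=0):
    """Plain k-means: assign points to the nearest center, then move the centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(axis=2), axis=1)
        # Update step: each center moves to the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two obvious groups of 2-D points.
X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
print(kmeans(X))
```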
Association rule learning is a type of unsupervised learning that is used to identify patterns in
data. Association rule learning algorithms work by finding relationships between different items in
a dataset.
• Apriori Algorithm
• Eclat Algorithm
• FP-Growth Algorithm
• Unsupervised learning can help you gain insights from unlabeled data that you might not
have been able to get otherwise.
• Unsupervised learning is good at finding patterns and relationships in data without being
told what to look for. This can help you learn new things about your data.
What is a Multilayer Perceptron Neural Network?
A multilayer perceptron (MLP) neural network belongs to the class of feedforward neural networks.
It is an artificial neural network in which all the nodes of one layer are interconnected with the
nodes of the adjacent layers.
The MLP neural network works only in the forward direction: all nodes are fully connected, and
each node passes its value to the nodes of the next layer only in the forward direction. The MLP
uses the backpropagation algorithm during training to increase the accuracy of the model.
This network has three main layers that combine to form a complete Artificial Neural Network. These
layers are as follows:
Input Layer
It is the initial or starting layer of the Multilayer perceptron. It takes input from the training data set
and forwards it to the hidden layer. There are n input nodes in the input layer. The number of input
nodes depends on the number of dataset features. Each input vector variable is distributed to each
of the nodes of the hidden layer.
Hidden Layer
It is the heart of all Artificial neural networks. This layer comprises all computations of the neural
network. The edges of the hidden layer have weights multiplied by the node values. This layer uses
the activation function.
The number of hidden-layer nodes must be chosen carefully: too few nodes leave the model unable
to work efficiently with complex data, while too many nodes will result in an overfitting problem.
Output Layer
This layer gives the estimated output of the neural network. The number of nodes in the output
layer depends on the type of problem: for a single target variable, use one node; for an N-class
classification problem, the network uses N nodes in the output layer.
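A minimal sketch of these three layers wired together as a forward pass (layer sizes, activations, and names are illustrative assumptions):

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass through fully connected layers: ReLU hidden, sigmoid output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)      # hidden layers use ReLU
    z = h @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-z))         # sigmoid output for a single target

# 3 input features -> 4 hidden nodes -> 1 output node.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]
biases = [np.zeros(4), np.zeros(1)]
print(mlp_forward(np.array([0.2, -0.1, 0.7]), weights, biases))
```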