Unit 4
2. (a) Discuss Hebbian Learning rule and winner-take all learning rule.
Artificial neural networks (ANNs) are computing systems developed by taking inspiration from
biological neural networks, of which the human brain is the prime example. These networks are
made functional through training that follows some kind of learning rule.
A learning rule enhances an artificial neural network's performance by updating the weights and
bias levels of the network when certain conditions are met during training. It is therefore a crucial
part of developing a neural network.
Developed by Donald Hebb in 1949, Hebbian learning is an unsupervised learning rule that works by
adjusting the weight between two neurons in proportion to the product of their activations.
According to this rule, the weight between two neurons is increased if they activate together and
decreased if they work in opposite directions; if there is no correlation between their signals, the
weight remains the same. As for the sign of the update, it is positive when both nodes have the
same sign (both positive or both negative), and negative when one node is positive and the other
negative.
Formula
Δwi = α xi y
where,
• α is the learning rate
• xi is the i-th input
• y is the output
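As a minimal sketch of this rule (assuming a single linear neuron; the function and variable names below are illustrative, not part of the original text):

```python
import numpy as np

def hebbian_update(w, x, alpha=0.1):
    """One Hebbian step: delta_w = alpha * x * y, with no teacher signal."""
    y = np.dot(w, x)           # output of a single linear neuron
    return w + alpha * x * y   # weights grow when x and y are correlated

# Repeated presentation of the same pattern strengthens the weights.
w = np.array([0.1, 0.1])
for _ in range(3):
    w = hebbian_update(w, np.array([1.0, 1.0]))
print(w)
```

Note that, because the update has no error term, the weights grow without bound unless some normalization is added.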
Perceptron Learning Rule
Developed by Rosenblatt, the perceptron learning rule is an error-correction rule used in a single-
layer feed-forward network. Unlike Hebbian learning, it is a supervised learning rule.
This rule works by finding the difference between the actual and desired outputs and adjusting the
weights accordingly. Naturally, the rule requires a set of input vectors with known desired outputs,
along with the weights, so that it can produce an output.
Formula
w = w + η(y − ŷ)x
where,
• w is the weight vector
• η is the learning rate
• y is the desired output
• ŷ is the predicted output
• x is the input vector
Correlation Learning Rule
Based on a principle similar to the Hebbian rule, the correlation learning rule also increases or
decreases the weights depending on the phases of the two neurons.
If the neurons are in opposite phases, the weight is pushed towards the negative side, and if they
are in the same phase, towards the positive side.
The only thing that makes this rule different from the Hebbian learning rule is that it is supervised
in nature, because it uses the desired (target) output tj instead of the actual output.
Formula
Δwij = α xi tj
Perceptron
The perceptron was introduced by Frank Rosenblatt in 1957. He proposed a perceptron learning rule
based on the original MCP (McCulloch-Pitts) neuron. A perceptron is an algorithm for supervised
learning of binary classifiers. The algorithm enables neurons to learn, processing the elements of
the training set one at a time.
Perceptron is a type of artificial neural network, which is a fundamental concept in machine learning.
The basic components of a perceptron are:
1. Input Layer: The input layer consists of one or more input neurons, which receive input
signals from the external world or from other layers of the neural network.
2. Weights: Each input neuron is associated with a weight, which represents the strength of the
connection between the input neuron and the output neuron.
3. Bias: A bias term is added to the input layer to provide the perceptron with additional
flexibility in modeling complex patterns in the input data.
4. Activation Function: The activation function determines the output of the perceptron based
on the weighted sum of the inputs and the bias term. Common activation functions used in
perceptrons include the step function, the sigmoid function, and the ReLU function (see the
sketch after this list).
5. Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates
the class or category to which the input data belongs.
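As a small illustration of the three activation functions named above (a sketch; the NumPy implementations are mine, not from the original text):

```python
import numpy as np

def step(z):
    """Heaviside step: the classic perceptron activation, output 0 or 1."""
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    """Squashes the net input smoothly into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(step(z), sigmoid(z), relu(z))
```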
Types of Perceptron:
1. Single layer: A single-layer perceptron can learn only linearly separable patterns.
2. Multilayer: A multilayer perceptron has two or more layers and therefore greater processing
power, so it can learn more complex patterns.
The Perceptron algorithm learns the weights for the input signals in order to draw a linear decision
boundary.
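The following is a minimal sketch of such training, combining the step activation with the update rule w = w + η(y − ŷ)x given above (the toy AND problem and all names are illustrative):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=10):
    """Perceptron rule: w <- w + eta * (y - y_hat) * x, with a bias term."""
    X = np.hstack([X, np.ones((len(X), 1))])     # fold the bias into the weights
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, target in zip(X, y):
            y_hat = 1 if np.dot(w, xi) >= 0 else 0   # step activation
            w += eta * (target - y_hat) * xi         # error-correction update
    return w

# Logical AND is linearly separable, so the rule converges to a separating line.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
print(train_perceptron(X, y))
```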
Advantages:
• It can reach a comparable accuracy ratio on both large and small datasets.
Disadvantages:
• It is tough to quantify how much each independent variable affects the dependent variable.
2. Explain various mathematical models of Neurons with their learning rules. Discuss in detail
Gradient Descent Algorithm.
Gradient Descent is known as one of the most commonly used optimization algorithms to train
machine learning models by means of minimizing errors between actual and expected results.
Further, gradient descent is also used to train Neural Networks.
The behaviour of gradient-based optimization around the local minimum or local maximum of a
function can be described as follows:
o If we move towards the negative gradient, i.e. away from the gradient of the function at the
current point, we approach the local minimum of that function. This procedure is known as
gradient descent, also called steepest descent.
o If we move towards the positive gradient, i.e. towards the gradient of the function at the
current point, we approach the local maximum of that function. This procedure is known as
gradient ascent.
The main objective of using a gradient descent algorithm is to minimize the cost function using
iteration. To achieve this goal, it performs two steps iteratively: first it computes the gradient (the
first-order derivative) of the cost function at the current point, and then it takes a step in the
direction opposite to the gradient, scaled by the learning rate.
Before looking at the working principle of gradient descent, we should know how to describe the
slope of a line, as in linear regression. The equation for simple linear regression is given as:
Y = mX + c
where 'm' represents the slope of the line, and 'c' represents the intercept on the y-axis.
An arbitrary starting point is chosen to evaluate the performance. At this starting point, we take
the first derivative to obtain the slope, and use the tangent line at that point to measure the
steepness of this slope. This slope then informs the updates to the parameters (weights and bias).
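As a minimal sketch of this loop on the linear model Y = mX + c (the toy data, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

# Toy data generated from Y = 2X + 1, so the answer is known in advance.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2 * X + 1

m, c = 0.0, 0.0   # arbitrary starting point
lr = 0.05         # learning rate (step size)

for _ in range(500):
    error = (m * X + c) - Y
    # Gradients of the mean squared error with respect to m and c.
    grad_m = 2 * np.mean(error * X)
    grad_c = 2 * np.mean(error)
    # Step in the direction opposite to the gradient.
    m -= lr * grad_m
    c -= lr * grad_c

print(m, c)   # approaches m = 2, c = 1
```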
Batch gradient descent (BGD) computes the error for each point in the training set and updates the
model only after evaluating all training examples. One full pass over the data is known as a training
epoch. In simple words, we have to sum over all examples for each single update.
o It is computationally efficient, as all resources are used to process all training samples together.
Stochastic gradient descent (SGD) is a type of gradient descent that runs one training example per
iteration. In other words, within each epoch it updates the parameters after every individual
training example in the dataset. As it requires only one training example at a time, it is easier to
store in allocated memory.
Because learning happens on every example, SGD offers a few advantages over the other gradient
descent variants, such as more frequent updates.
Mini-batch gradient descent is a combination of batch gradient descent and stochastic gradient
descent. It divides the training dataset into small batches and then performs the updates on those
batches separately. Splitting the training dataset into smaller batches strikes a balance between
the computational efficiency of batch gradient descent and the speed of stochastic gradient descent.
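A minimal sketch of this trade-off on the same linear model as above (names and hyperparameters are illustrative; setting batch_size to len(X) recovers batch gradient descent, and setting it to 1 recovers SGD):

```python
import numpy as np

def minibatch_gd(X, Y, lr=0.05, batch_size=2, epochs=200):
    """Mini-batch gradient descent for the linear model Y = mX + c."""
    m, c = 0.0, 0.0
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(len(X))          # shuffle once per epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            error = (m * X[idx] + c) - Y[idx]
            m -= lr * 2 * np.mean(error * X[idx])   # update per mini-batch
            c -= lr * 2 * np.mean(error)
    return m, c

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2 * X + 1
print(minibatch_gd(X, Y))   # approaches (2, 1)
```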
What is backpropagation?
• Computing the gradient in the backpropagation algorithm helps to minimize the cost function;
the computation is implemented with the chain rule from calculus, which lets the derivatives
be propagated through the layers of the neural network.
• The algorithm consists of two phases:
• Forward pass
• Backward pass
• In the forward pass, the input is initially fed into the input layer; these raw data inputs are
what we use to train the neural network.
• The inputs and their corresponding weights are passed to the hidden layer, which performs
the computation on the data it receives. If there are two hidden layers h1 and h2 in the
network, the output of h1 is used as the input of h2. The bias is added before applying the
activation function.
How does backward pass work?
• In the backward pass, the error is transmitted back through the network, which helps the
network improve its performance by learning and adjusting its internal weights.
• To measure the error generated by the forward pass, one of the most commonly used
methods is the mean squared error (MSE), which computes the squared difference between
the predicted output and the desired output.
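As a minimal sketch of both passes on a tiny one-hidden-layer network (the sigmoid activations, layer sizes, and names are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input -> hidden
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden -> output

x = np.array([0.5, -0.2])   # one input example
t = np.array([1.0])         # desired output
lr = 0.5

for _ in range(100):
    # Forward pass: weighted sum plus bias, then activation, layer by layer.
    h = sigmoid(x @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule applied to the squared error (y - t)^2.
    delta_out = 2 * (y - t) * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    W2 -= lr * np.outer(h, delta_out); b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hid); b1 -= lr * delta_hid
```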
2. Differentiate between Biological neural network & ANN. Explain Hopfield network with
algorithm and learning rule.
1. Artificial Neural Network: An Artificial Neural Network (ANN) is a type of neural network that is
based on a Feed-Forward strategy. It is called this because information passes through the nodes
continuously until it reaches the output node. It is also known as the simplest type of neural
network.
An advantage of ANN:
• ANN copes with highly volatile data and serves best in financial time series forecasting.
A disadvantage of ANN:
• The simple architecture makes it difficult to explain the behavior of the network.
2. Biological Neural Network: A Biological Neural Network (BNN) is a structure that consists of
synapses, dendrites, a cell body, and an axon. In this neural network, the processing is carried out
by neurons: dendrites receive signals from other neurons, the soma (cell body) sums all the
incoming signals, and the axon transmits the signals to other cells.
Biological Neural Networks (BNNs) and Artificial Neural Networks (ANNs) are both composed of
similar basic components, but there are some differences between them.
Neurons: In both BNNs and ANNs, neurons are the basic building blocks that process and transmit
information. However, the neurons of BNNs are more complex and diverse than those of ANNs. In
BNNs, neurons have multiple dendrites that receive input from multiple sources, and their axons
transmit signals to other neurons, while in ANNs neurons are simplified and usually have only a
single output.
Synapses: In both BNNs and ANNs, synapses are the points of connection between neurons, where
information is transmitted. However, in ANNs, the connections between neurons are usually fixed,
and the strength of the connections is determined by a set of weights, while in BNNs, the
connections between neurons are more flexible, and the strength of the connections can be
modified by a variety of factors, including learning and experience.
Neural Pathways: In both BNNs and ANNs, neural pathways are the connections between neurons
that allow information to be transmitted throughout the network. However, in BNNs, neural
pathways are highly complex and diverse, and the connections between neurons can be modified by
experience and learning. In ANNs, neural pathways are usually simpler and predetermined by the
architecture of the network.
The correspondence and contrasts can be tabulated as follows:
Structure: ANN: input, weight, output; BNN: dendrites, synapse, axon.
Learning: ANN needs very precise structures and formatted data; BNN can tolerate ambiguity.
Processor: ANN is complex; BNN is simple.
Computing: ANN is centralized; BNN is distributed.
Head-to-head comparison between Artificial Neural Network and Biological Neural Network
Here, you will learn head-to-head comparisons between Artificial Neural Network and Biological
Neural Network. The main differences between Artificial Neural Network and Biological Neural
Network are as follows:
Processing: ANN processing is sequential and centralized; BNN processes the information in a
parallel and distributive manner.
Control Mechanism: In an ANN, a control unit keeps track of all computer-related operations; in a
BNN, all processing is managed centrally.
Hopfield neural network was invented by Dr. John J. Hopfield in 1982. It consists of
a single layer which contains one or more fully connected recurrent neurons. The
Hopfield network is commonly used for auto-association and optimization tasks.
Architecture
Following are some important points to keep in mind about discrete Hopfield
network −
• This model consists of neurons with one inverting and one non-inverting
output.
• The output of each neuron is fed as input to all the other neurons, but not back to
itself.
• Weight/connection strength is represented by wij.
• Connections can be excitatory as well as inhibitory: excitatory if the output of the
neuron is the same as the input, otherwise inhibitory.
• Weights are symmetrical, i.e. wij = wji.
The outputs from Y1 going to Y2, Yi and Yn carry the weights w12, w1i and w1n respectively;
similarly, the other arcs carry their own weights.
Training Algorithm
The weights are obtained with the Hebbian principle from the P patterns to be stored. For binary
input patterns s(p), p = 1 to P:
wij = Σp=1..P [2si(p) − 1][2sj(p) − 1], for i ≠ j
For bipolar input patterns:
wij = Σp=1..P si(p) sj(p), for i ≠ j
Testing Algorithm
Step 1 − Initialize the weights, which are obtained from training algorithm by using
Hebbian principle.
Step 2 − Perform steps 3-9, if the activations of the network have not converged.
Step 3 − For each input vector X, perform steps 4-8.
Step 4 − Make initial activation of the network equal to the external input
vector X as follows −
yi = xi for i = 1 to n
Step 5 − For each unit Yi, perform steps 6-9.
Step 6 − Calculate the net input of the network as follows −
yin,i = xi + Σj yj wji
Step 7 − Apply the activation as follows over the net input to calculate the output −
yi = 1 if yin,i > θi; yi (unchanged) if yin,i = θi; 0 if yin,i < θi
Step 8 − Broadcast this output yi to all the other units.
Step 9 − Test the network for convergence.
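A minimal sketch of this training and recall procedure for bipolar patterns (the 4-unit example, thresholds of zero, and all names are illustrative assumptions):

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian weights for bipolar patterns: w_ij = sum_p s_i(p)*s_j(p), w_ii = 0."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for s in patterns:
        W += np.outer(s, s)
    np.fill_diagonal(W, 0)   # no self-connections
    return W

def hopfield_recall(W, x, steps=5):
    """Asynchronous updates with net input y_in,i = x_i + sum_j y_j * w_ji."""
    y = x.copy()
    for _ in range(steps):
        for i in range(len(y)):
            net = x[i] + W[i] @ y
            if net != 0:                  # threshold theta_i = 0 assumed
                y[i] = 1 if net > 0 else -1
    return y

pattern = np.array([[1, -1, 1, -1]])
W = hopfield_train(pattern)
noisy = np.array([1, 1, 1, -1])          # one bit flipped
print(hopfield_recall(W, noisy))         # recovers [1, -1, 1, -1]
```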
3. Explain supervised and unsupervised learning. Write and explain Multilayer Perceptron Model.
Supervised learning is a type of machine learning algorithm that learns from labeled data. Labeled
data is data that has been tagged with a correct answer or classification.
Key Points:
• The machine learns the relationship between inputs (for example, fruit images) and outputs (the corresponding fruit labels).
• The trained machine can then make predictions on new, unlabeled data.
• Regression: A regression problem is when the output variable is a real value, such as
“dollars” or “weight”.
1- Regression
Regression is a type of supervised learning that is used to predict continuous values, such as house
prices, stock prices, or customer churn. Regression algorithms learn a function that maps from the
input features to the output value.
• Linear Regression
• Polynomial Regression
2- Classification
Classification is a type of supervised learning that is used to predict categorical values, such as
whether a customer will churn or not, whether an email is spam or not, or whether a medical image
shows a tumor or not. Classification algorithms learn a function that maps from the input features to
a probability distribution over the output classes.
• Logistic Regression
• Decision Trees
• Random Forests
• Naive Bayes
• Spam filtering: Supervised learning algorithms can be trained to identify and classify spam
emails based on their content, helping users avoid unwanted messages.
• Image classification: Supervised learning can automatically classify images into different
categories, such as animals, objects, or scenes, facilitating tasks like image search, content
moderation, and image-based product recommendations.
• Medical diagnosis: Supervised learning can assist in medical diagnosis by analyzing patient
data, such as medical images, test results, and patient history, to identify patterns that
suggest specific diseases or conditions.
• Fraud detection: Supervised learning models can analyze financial transactions and identify
patterns that indicate fraudulent activity, helping financial institutions prevent fraud and
protect their customers.
Advantages:
• We have complete control over choosing the number of classes we want in the training data.
Disadvantages:
• Training for supervised learning needs a lot of computation, so it requires a lot of time.
Unsupervised learning is a type of machine learning that learns from unlabeled data. This means that
the data does not have any pre-existing labels or categories. The goal of unsupervised learning is to
discover patterns and relationships in the data without any explicit guidance.
• Clustering: A clustering problem is where you want to discover the inherent groupings in the
data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules that
describe large portions of your data, such as 'people who buy X also tend to buy Y'.
Clustering
Clustering is a type of unsupervised learning that is used to group similar data points
together. Clustering algorithms work by iteratively assigning data points to clusters so that points
end up close to their own cluster's center and far from the data points in other clusters.
1. Exclusive (partitioning)
2. Agglomerative
3. Overlapping
4. Probabilistic
Clustering Types:-
1. Hierarchical clustering
2. K-means clustering
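As a minimal sketch of the iterative assign-and-update idea behind clustering, here is a plain k-means implementation (the 2-D toy data and all names are illustrative):

```python
import numpy as np

def kmeans(X, k=2, iters=10, seed=0):
    """Plain k-means: assign points to the nearest center, then move the centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(axis=2), axis=1)
        # Update step: each center moves to the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two obvious groups of 2-D points.
X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
print(kmeans(X))
```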
Association rule learning is a type of unsupervised learning that is used to identify patterns in
data. Association rule learning algorithms work by finding relationships between different items in
a dataset.
• Apriori Algorithm
• Eclat Algorithm
• FP-Growth Algorithm
• Unsupervised learning can help you gain insights from unlabeled data that you might not
have been able to get otherwise.
• Unsupervised learning is good at finding patterns and relationships in data without being
told what to look for. This can help you learn new things about your data.
What is a Multilayer Perceptron Neural Network?
A multilayer perceptron (MLP) neural network belongs to the class of feedforward neural networks.
It is an artificial neural network in which all the nodes of one layer are interconnected with the
nodes of the adjacent layers.
The MLP neural network works only in the forward direction: all nodes are fully connected, and
each node passes its value to the nodes of the next layer only in the forward direction. The MLP
uses the backpropagation algorithm during training to increase the accuracy of the model.
This network has three main layers that combine to form a complete Artificial Neural Network. These
layers are as follows:
Input Layer
It is the initial or starting layer of the Multilayer perceptron. It takes input from the training data set
and forwards it to the hidden layer. There are n input nodes in the input layer. The number of input
nodes depends on the number of dataset features. Each input vector variable is distributed to each
of the nodes of the hidden layer.
Hidden Layer
It is the heart of all Artificial neural networks. This layer comprises all computations of the neural
network. The edges of the hidden layer have weights multiplied by the node values. This layer uses
the activation function.
The number of hidden-layer nodes must be chosen carefully: too few nodes leave the model unable
to work efficiently with complex data, while too many nodes will result in an overfitting problem.
Output Layer
This layer gives the estimated output of the neural network. The number of nodes in the output
layer depends on the type of problem: for a single target variable, use one node; for an N-class
classification problem, the network uses N nodes in the output layer.
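A minimal sketch of these three layers wired together as a forward pass (layer sizes, activations, and names are illustrative assumptions):

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass through fully connected layers: ReLU hidden, sigmoid output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)      # hidden layers use ReLU
    z = h @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-z))         # sigmoid output for a single target

# 3 input features -> 4 hidden nodes -> 1 output node.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]
biases = [np.zeros(4), np.zeros(1)]
print(mlp_forward(np.array([0.2, -0.1, 0.7]), weights, biases))
```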