B.TECH PROJECT
MID TERM REPORT
HANDWRITTEN DIGITS RECOGNITION USING NEURAL NETWORKS
1. Introduction
2. Machine Learning
2.1 Supervised Learning
2.2 Unsupervised Learning
2.3 Linear Regression
2.4 Logistic Regression
3. Neural Networks
3.1 Introduction
3.2 Transfer Function
3.3 Perceptron
3.3.1 Implementation of NAND Gates Using a Perceptron
3.4 Sigmoid Neuron
3.5 Cost Function
3.6 Gradient Descent
3.7 Backpropagation Algorithm
4. Future Timeline of the Project
5. References
INTRODUCTION
Although handwritten digit recognition is a well-studied problem, we decided to take it up so as to learn as much as possible over the course of the project. Before going into the advanced neural-network concepts involved in the project, we first describe the basic machine learning concepts on which it builds.
MACHINE LEARNING
Machine learning is the field that gives computers the ability to learn without being explicitly programmed. It is closely related to data mining, except that machine learning looks for patterns in data, while data mining focuses on extracting data for human use.
Supervised Learning
Supervised learning is the process of an algorithm learning from a training dataset, where you have input variables (X) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output:
Y = f(X)
Unsupervised Learning
Unsupervised learning is where you only have input data (X) and no corresponding output variables. The system itself recognises correlations and organises the patterns into categories accordingly. The goal of unsupervised learning is to model the structure in the data in order to learn more about the data.
Linear Regression
Linear regression is a machine learning technique used to fit an equation to a dataset, for example a line:
y = mx
When the inputs are collected into a matrix X and the outputs into a vector Y, the parameter vector ϴ satisfies:
Xϴ = Y
Multiplying both sides by X transpose and then making ϴ the subject of the equation, we get the normal equation:
ϴ = (X^T X)^(-1) X^T Y
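As an illustrative sketch, the normal equation can be evaluated directly with NumPy; the small dataset below is made up purely for the example:

```python
import numpy as np

# Hypothetical toy dataset: points lying roughly on the line y = 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])   # inputs, one feature per row
Y = np.array([2.1, 3.9, 6.2, 7.8])           # observed outputs

# Normal equation: theta = (X^T X)^(-1) X^T Y
theta = np.linalg.inv(X.T @ X) @ X.T @ Y
print(theta)  # ~[1.99], the fitted slope m in y = mx
```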
Logistic Regression
Logistic regression is a machine learning technique used for binary classification: it fits a best-fit logistic function to the data. In linear regression the outcome (dependent variable) is continuous and can take any one of an infinite number of possible values; in logistic regression the outcome has only a limited number of possible values.
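As a minimal sketch of how a fitted logistic function is used for binary classification (theta0 and theta1 below are hypothetical, already-fitted parameters):

```python
import numpy as np

def sigmoid(z):
    # Logistic function: squashes any real z into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, already-fitted parameters for a single input feature.
theta0, theta1 = -4.0, 1.5

def predict(x):
    # The output is a probability; threshold it at 0.5 for a binary label.
    p = sigmoid(theta0 + theta1 * x)
    return p, int(p >= 0.5)

print(predict(1.0))  # low probability  -> class 0
print(predict(5.0))  # high probability -> class 1
```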
NEURAL NETWORKS
Introduction
Transfer Function
PERCEPTRON
A perceptron takes several binary values as input and gives a single binary value as output; binary values are either 0 or 1.
Suppose there are 3 inputs x1, x2 and x3, each of which will be either 0 or 1, and one output, which will also have a binary value. A weight is associated with each of the inputs to scale up or scale down the importance of that input. These weights are real numbers, and the output value is decided by whether the weighted sum ∑ wjxj exceeds a certain threshold value: the output is 0 if ∑ wjxj is less than or equal to the threshold, and 1 otherwise.
Truth table of the NAND gate:
x1  x2  Output
0   0   1
0   1   1
1   0   1
1   1   0
Consider two inputs x1 and x2, with a threshold value of -3, and let the weight assigned to each of the inputs be -2.
When x1 and x2 are both 0, the weighted sum is (-2)(0) + (-2)(0) = 0. Since this is more than the threshold value, the output is 1. When one of the inputs (say x1) is 1 and x2 is 0, the weighted sum is (-2)(1) + (-2)(0) = -2. Since this is again greater than -3, the output is again 1. When both inputs are equal to 1, the weighted sum is -4, which is less than our threshold value, so the output is 0. We have thus implemented the NAND gate.
The resulting neural network would look like this: [figure: a single perceptron with inputs x1 and x2, weights -2 and threshold -3]
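A small sketch of this perceptron in Python, using the weights and threshold from the worked example above:

```python
def perceptron(inputs, weights, threshold):
    # Output is 1 if the weighted sum exceeds the threshold, else 0.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Weights and threshold taken from the NAND example above.
WEIGHTS, THRESHOLD = [-2, -2], -3

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron([x1, x2], WEIGHTS, THRESHOLD))
# Reproduces the NAND truth table: outputs 1, 1, 1, 0
```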
SIGMOID NEURON
The inputs here can be any real value between 0 and 1, unlike the perceptron, where the inputs were binary. The output is σ(w·x − t), where σ is the sigmoid function, defined by σ(z) = 1/(1 + e^-z).
The sigmoid neuron's behaviour closely resembles that of the perceptron. When the weighted sum is a large positive quantity, e^-z tends to 0 and the output approaches 1. When the weighted sum is a large negative quantity, e^-z tends to infinity and the output approaches 0. For other values, the output is a real number between 0 and 1.
Here is the graph of the sigmoid function, where z is the weighted sum and the threshold value taken together (z = w·x − t): [figure: S-shaped sigmoid curve rising from 0 towards 1]
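A short sketch of a sigmoid neuron, reusing the weights and threshold from the NAND perceptron above to show how the hard threshold becomes a smooth output:

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

# Weights and threshold from the NAND perceptron example.
w = np.array([-2.0, -2.0])
t = -3.0

def neuron_output(x):
    return sigmoid(np.dot(w, x) - t)   # sigma(w.x - t)

print(neuron_output(np.array([0.0, 0.0])))  # ~0.95, close to 1
print(neuron_output(np.array([1.0, 1.0])))  # ~0.27, leaning towards 0
print(sigmoid(50.0), sigmoid(-50.0))        # ~1.0 and ~0.0 at the extremes
```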
COST FUNCTION
In order to come up with a correct output, our neural network should produce a solution that is as close as possible to the desired solution; that is, the difference between the actual solution and the solution provided by our neural network should be as small as possible. To express this mathematically, we define a cost function:
C = (1/2n) ∑x ||y(x) − a||^2
Here n is the total number of training inputs, y(x) is the actual desired output, and a is the output produced by our neural network. Clearly, the difference between y(x) and a should be as small as possible. The terms are squared because we are computing the mean squared error, and this value should be as close to 0 as possible; a perfectly accurate output gives a cost of exactly 0.
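A minimal sketch of this cost in code, with made-up desired and actual outputs:

```python
import numpy as np

def mse_cost(desired, actual):
    # Quadratic cost: C = (1/2n) * sum over x of ||y(x) - a||^2
    n = len(desired)
    return np.sum((desired - actual) ** 2) / (2 * n)

# Hypothetical desired outputs y(x) and network outputs a.
y = np.array([1.0, 0.0, 1.0, 0.0])
a = np.array([0.9, 0.2, 0.8, 0.1])
print(mse_cost(y, a))  # 0.0125; the cost is 0 only for a perfect match
```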
We need to find a set of weights for which the cost function is as small as possible. We will do this using an algorithm known as gradient descent.
GRADIENT DESCENT
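Gradient descent repeatedly nudges each weight in the direction that decreases the cost, w ← w − η ∂C/∂w, where η is the learning rate. A minimal one-variable sketch (the cost C(w) = (w − 3)^2 and the learning rate below are illustrative choices):

```python
# Minimise C(w) = (w - 3)^2 by repeatedly stepping against the gradient.
def grad(w):
    return 2 * (w - 3)       # dC/dw

w = 0.0                      # arbitrary starting point
eta = 0.1                    # learning rate (step size)
for _ in range(100):
    w -= eta * grad(w)       # w <- w - eta * dC/dw
print(w)                     # converges towards 3, the minimum of C
```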
BACKPROPAGATION ALGORITHM
After the output is received at the output layer, the cost function is computed for the output produced at each neuron of the output layer. Once that is done, the error values are propagated backwards towards the initial layers so as to train the neural network further. These error values are used to compute the gradient of the cost function with respect to the assigned weights, and the weights are then updated so as to minimise the cost function. In this process of going back and forth, our network becomes trained after the input sets have been received and propagated a number of times.
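A minimal sketch of backpropagation on a tiny 2-3-1 sigmoid network trained on XOR with the quadratic cost; the architecture, learning rate and iteration count are illustrative choices, not our project's final design:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

# Biases play the role of negative thresholds (b = -t).
W1 = rng.normal(size=(2, 3)); b1 = np.zeros((1, 3))
W2 = rng.normal(size=(3, 1)); b2 = np.zeros((1, 1))
eta = 1.0                                          # learning rate

for _ in range(5000):
    # Forward pass: compute each layer's activations.
    h = sigmoid(X @ W1 + b1)                 # hidden layer
    a = sigmoid(h @ W2 + b2)                 # output layer

    # Backward pass: propagate the error from the output layer inwards.
    delta2 = (a - Y) * a * (1 - a)           # output error (MSE * sigmoid')
    delta1 = (delta2 @ W2.T) * h * (1 - h)   # hidden error via chain rule

    # Update every weight and bias against its gradient.
    W2 -= eta * (h.T @ delta2); b2 -= eta * delta2.sum(axis=0, keepdims=True)
    W1 -= eta * (X.T @ delta1); b1 -= eta * delta1.sum(axis=0, keepdims=True)

print(a.round(2).ravel())  # typically approaches [0, 1, 1, 0] after training
```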
FUTURE TIMELINE OF THE PROJECT
We believe that we now have a firm grasp of the concepts that will be employed in building our neural network. We would like to finish the complete coding/implementation part of our project by the end of the first week of April. If everything goes well, we intend to submit the final project report along with the code, at the latest by 15th April. This is a tentative timeline; however, we will try to make sure that we stick to it. We would also like to take this opportunity to thank you for your guidance and support in our project.
REFERENCES