Experiment 2.5 DL
Aim: Design and implement a fully connected neural network with at least 2 hidden layers
for a classification application. Use an appropriate learning algorithm, output function, and
loss function.
Theory:
A fully connected neural network, also known as a multilayer perceptron (MLP), is a type of
neural network where every neuron in one layer is connected to every neuron in the next
layer. This is in contrast to convolutional neural networks (CNNs), where neurons are only
connected to a small region of the input data.
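As a quick illustration of this full connectivity, a Keras Dense layer with 4 inputs and 3 neurons holds a 4x3 weight matrix, one weight per input-output pair (the layer sizes here are arbitrary examples):

import tensorflow as tf

# A Dense (fully connected) layer with 3 neurons applied to 4 input features.
layer = tf.keras.layers.Dense(units=3)
x = tf.random.normal((1, 4))        # one sample with 4 features
y = layer(x)                        # builds the layer and computes its output

# Every input feature connects to every neuron, so the kernel is 4 x 3,
# with one bias per neuron.
print(layer.kernel.shape)           # (4, 3)
print(layer.bias.shape)             # (3,)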
Activation Functions
Activation functions are used to introduce non-linearity into the neural network. In the
implementation, we used the following activation functions:
• ReLU (Rectified Linear Unit): f(x) = max(0, x) is a widely used activation function
that outputs 0 for negative inputs and passes positive inputs through unchanged. It is
simple and computationally efficient.
• Softmax: f(x_i) = exp(x_i) / sum_j exp(x_j) is an activation function used in the output
layer to produce a probability distribution over the classes. It ensures that the output
values lie between 0 and 1 and sum to 1.
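A short sketch of both functions using TensorFlow's built-in ops (the input values are illustrative):

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 1.0, 3.0])

# ReLU: negative inputs become 0, positive inputs pass through unchanged.
print(tf.nn.relu(x).numpy())        # [0. 0. 0. 1. 3.]

# Softmax: exponentiates and normalises, so outputs lie in (0, 1) and sum to 1.
logits = tf.constant([2.0, 1.0, 0.1])
probs = tf.nn.softmax(logits)
print(probs.numpy())                # approximately [0.66 0.24 0.10]
print(tf.reduce_sum(probs).numpy()) # 1.0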
Loss Functions
The loss function measures the difference between the predicted output and the true output.
In the implementation, we used the categorical cross-entropy loss function, which is
defined as:

Loss = -sum(y * log(y_pred))

where y is the one-hot true label and y_pred is the predicted probability distribution.
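As a quick sanity check of this definition, the snippet below compares the direct computation of -sum(y * log(y_pred)) with Keras's built-in CategoricalCrossentropy loss (the label and prediction values are illustrative):

import tensorflow as tf

# One-hot true label for class 2 (out of 3 classes) and a predicted distribution.
y_true = tf.constant([[0.0, 0.0, 1.0]])
y_pred = tf.constant([[0.1, 0.2, 0.7]])

# Direct computation: Loss = -sum(y * log(y_pred))
manual = -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1)
print(manual.numpy())               # approximately [0.3567]

# Keras's built-in categorical cross-entropy gives the same value.
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(y_true, y_pred).numpy())  # approximately 0.3567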
Optimization Algorithms
The optimization algorithm is used to minimize the loss function. In the implementation, we
used the Adam optimizer, a variant of stochastic gradient descent that adapts the learning
rate of each parameter using running estimates of the first and second moments (the mean
and the uncentered variance) of its gradients.
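A minimal sketch of a single Adam update in TensorFlow (the learning rate of 0.001 and the example gradient values are assumptions for illustration):

import tensorflow as tf

# Adam keeps running averages of each parameter's gradient and squared gradient,
# and uses them to scale that parameter's effective step size.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

w = tf.Variable([1.0, -2.0])
grads = tf.constant([0.5, -0.5])    # pretend these came from backpropagation

optimizer.apply_gradients([(grads, w)])
print(w.numpy())                    # each entry moves by roughly the learning rate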
Backpropagation
Backpropagation is an algorithm that computes the gradients of the loss function with
respect to the model parameters by propagating errors backwards through the network. The
optimizer then uses these gradients to update the parameters during training.
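In TensorFlow, backpropagation is performed automatically by tf.GradientTape; a toy sketch with a single weight and a squared-error loss (chosen purely for illustration):

import tensorflow as tf

w = tf.Variable(3.0)
x, y_true = 2.0, 10.0

# Record the forward pass so TensorFlow can backpropagate through it.
with tf.GradientTape() as tape:
    y_pred = w * x
    loss = (y_pred - y_true) ** 2

# dLoss/dw = 2 * (w*x - y_true) * x = 2 * (6 - 10) * 2 = -16
grad = tape.gradient(loss, w)
print(grad.numpy())   # -16.0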
Gradient Descent
Gradient descent is an optimization algorithm that updates the model parameters in the
direction of the negative gradient of the loss function. The update rule is:

w_new = w_old - learning_rate * gradient

where w_old is the current value of the parameter, learning_rate is a hyperparameter that
controls the step size, and gradient is the gradient of the loss function with respect to the
parameter.
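Continuing the toy example from the backpropagation sketch, one hand-written gradient-descent step looks like this (the learning rate of 0.1 is an illustrative choice):

# Plain gradient-descent update: w_new = w_old - learning_rate * gradient
w_old = 3.0
gradient = -16.0        # taken from the backpropagation sketch above
learning_rate = 0.1

w_new = w_old - learning_rate * gradient
print(w_new)            # 4.6, i.e. the weight moves in the direction that reduces the loss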
Network Architecture
• Input Layer: The input layer takes in the input data and passes it to the first hidden
layer.
• Hidden Layers: The hidden layers build increasingly abstract representations of the input
data. Each hidden layer consists of neurons that compute a weighted sum of the previous
layer's outputs and pass it through an activation function.
• Output Layer: The output layer takes the output of the last hidden layer and produces
the final output of the network.
The MNIST dataset is a widely used benchmark for handwritten digit recognition. It consists
of 70,000 grayscale images of handwritten digits (0-9), each 28x28 pixels, split into a training
set of 60,000 images and a test set of 10,000 images.

Classification is a supervised learning problem in which the goal is to predict a categorical
label for a given input. In the implementation, we used a softmax output layer to produce a
probability distribution over the 10 classes (0-9).
Code:
import tensorflow as tf
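A minimal sketch of such a network, building on the import above and consistent with the theory section and the architecture stated in the conclusion (two hidden layers of 128 and 64 ReLU units, a 10-way softmax output, Adam, and categorical cross-entropy on MNIST); the epoch count, batch size, and validation split are illustrative assumptions:

import tensorflow as tf

# Load MNIST: 60,000 training and 10,000 test images of 28x28 pixels.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Scale pixel values to [0, 1] and one-hot encode the 10 class labels.
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Fully connected network: flatten -> 128 ReLU -> 64 ReLU -> 10-way softmax.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Adam optimizer with categorical cross-entropy, as described in the theory.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Train for a few epochs (epoch count and batch size are assumptions).
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.1)

# Evaluate on the held-out test set.
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")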
Output:
Conclusion:
We designed a fully connected neural network with two hidden layers of 128 and 64 neurons,
respectively, using ReLU as the activation function. The output layer uses the Softmax
activation function to predict the probabilities of the 10 classes.