CV Lab 7
OBJECTIVES
The idea of machine learning is to build systems that, given an input, are able to predict the
correct output with respect to a certain predefined context. For example, we may want to build
an image classification system that is capable of telling the user what a given input image is of.
Figure 1: A neural network takes in an image and returns what that image is of
Neural networks are made up of programmatically designed neurons that are connected to each
other. These programmatically designed neurons are called perceptrons.
Perceptrons form the atomic unit of neural networks. They take as input a set of numbers (one or
many), multiply the inputs by a set of weights, and return an output.
Figure 2: A perceptron
In Figure 2, a perceptron takes three values as input and returns an output. In the most common
operation, each input is multiplied by a weight and the sum of all the products is returned.
Depending on the application and the data, the perceptron can be programmed to perform other
operations. For example, after the sum of the products is calculated, the perceptron may return
a 1 if the sum is over a certain threshold and a 0 otherwise.
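The sketch below is a minimal, hypothetical perceptron in Python (an illustration, not part of the lab code) that follows the threshold behaviour just described: it multiplies three inputs by three weights, sums the products, and returns 1 if the sum is over a threshold and 0 otherwise.

import numpy as np

def perceptron(inputs, weights, threshold=0.5):
    # Multiply each input by its weight and sum the products
    weighted_sum = np.dot(inputs, weights)
    # Return 1 if the sum is over the threshold, 0 otherwise
    return 1 if weighted_sum > threshold else 0

# Three inputs and three weights, as in Figure 2 (made-up values)
x = np.array([0.2, 0.7, 0.1])
w = np.array([0.4, 0.6, 0.9])
print(perceptron(x, w))  # prints 1, since 0.2*0.4 + 0.7*0.6 + 0.1*0.9 = 0.59 > 0.5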
Figure 3 shows one way of creating a neural network. Each of the individual circles represents a
perceptron. Each perceptron takes in a set of inputs and returns an output to the next layer of
perceptrons. Here, each perceptron in a particular layer is connected to all the perceptrons in
the previous layer; this is called a fully connected neural network.
The first layer of a neural network is usually called the input layer and, similarly, the last
layer is called the output layer. All the layers between them are known as hidden layers. An
input layer and an output layer are a must in any network, but the number of hidden layers can
vary from zero to as many as you want. The size of the input layer is decided by the size of the
images: if an image is 28 x 28, the size of the input layer will be 784 (28 * 28). The size of
each hidden layer is decided by the user. Finally, the size of the output layer depends on the
number of labels we have.
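As a sketch of how these sizes come together (using sklearn's MLPClassifier, the same class the exercise below imports; the hidden-layer sizes and max_iter here are arbitrary assumptions), a fully connected network for 28 x 28 images with 10 labels could be declared as follows:

from sklearn.neural_network import MLPClassifier

# Input layer: 784 values (one per pixel of a 28 x 28 image)
# Hidden layers: chosen by the user, e.g. 128 and 64 perceptrons
# Output layer: 10 values (one per label)
clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=20)

# Calling clf.fit(x_train, y_train) would infer the 784-unit input
# layer and the 10-unit output layer from the data itself; only the
# hidden layers are specified explicitly.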
Training a Network
After designing the network, we can start training it. The training phase of any neural network
comprises two parts: first, feed forward the input and, second, backpropagate the error.
Feeding forward means taking the input, passing it through the perceptrons in our network, and
calculating the output. The input values are multiplied by the weight values of the perceptrons
and an output is generated.
While backpropagating, we take the feed-forward output (from the last step) and find its
difference from the actual output (the ground truth). Using this error, we modify the weights of
the perceptrons.
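As a toy illustration of the two steps (a sketch with made-up numbers, for a single perceptron with no activation function): the feed-forward output is the weighted sum of the inputs, the error is its difference from the ground truth, and the weights are nudged against the error.

import numpy as np

x = np.array([1.0, 2.0])     # input values
w = np.array([0.5, -0.3])    # current weights
target = 1.0                 # actual output (ground truth)

# Feed forward: multiply inputs by weights and sum
output = np.dot(x, w)        # 1.0*0.5 + 2.0*(-0.3) = -0.1

# Backpropagate: find the difference from the ground truth...
error = output - target      # -1.1

# ...and use this error to modify the weights (gradient descent step)
learning_rate = 0.1
w = w - learning_rate * error * x
print(w)                     # [0.61, -0.08]; the new output, 0.45, is closer to 1.0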
Example 1:
from sklearn.datasets import fetch_openml

print('Getting MNIST Data...')
# mnist = fetch_mldata('MNIST original')  # fetch_mldata is deprecated; use fetch_openml
images, labels = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
print(images.shape, labels.shape)
print('MNIST Data downloaded!')
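As a possible next step after Example 1 (a sketch only; the test split, hidden-layer size, and max_iter are assumptions, not values given by the lab), the downloaded data can be split and used to train a small network:

from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

images = images / 255  # scale pixel values to [0, 1]
x_train, x_test, y_train, y_test = train_test_split(images, labels, test_size=0.2)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=10)
clf.fit(x_train, y_train)  # feed forward and backpropagate internally
print('Test accuracy:', clf.score(x_test, y_test))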
Exercise:
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split
y.append(1)

# Convert the test images and labels into NumPy arrays
x_test = np.array(images_test)
y_test = np.array(labels_test)
y_test = y_test.reshape(-1, 1)

print(x_train.shape)

# Flatten each colour image (height x width x channels) into a single row
nsamples, nx, ny, c = x_train.shape
x_train = x_train.reshape((nsamples, nx * ny * c))
# x_train_flatten = x_train.reshape(x_train.shape[0], -1).T

# Scale the pixel values to the range [0, 1]
x_train = x_train / 255
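From here, one possible continuation (assuming x_train, y_train, x_test, and y_test were built earlier in the exercise, and that the test images have the same height x width x channels shape as the training images) is to flatten and scale the test set the same way, then train and score the classifier:

# Flatten and scale the test images exactly as the training images
nsamples_t, nx_t, ny_t, c_t = x_test.shape
x_test = x_test.reshape((nsamples_t, nx_t * ny_t * c_t))
x_test = x_test / 255

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=50)
clf.fit(x_train, y_train)
print('Accuracy:', clf.score(x_test, y_test.ravel()))  # ravel() flattens the (n, 1) labels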