Final Documentation of Final Project-converted
A PROJECT REPORT
ON
“FACIAL EXPRESSION RECOGNITION USING CNN”
SUBMITTED TO
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY,
ANANTAPUR
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE AWARD OF THE DEGREE OF
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Sanskrithi School of Engineering
Facial Expression Recognition using CNN
CERTIFICATE
ACKNOWLEDGEMENT
We express our heartfelt thanks to our parents and family members, who
gave us moral support in completing the course.
DECLARATION
Dr. A Senthil Kumar, B.E, M.E, M.B.A, PGDVLSI, DISM, PDF (TUT, SA),
Senior PDF (VSB TUO, EUROPE), PGP AI & ML (University of Texas, Austin)
PRINCIPAL Sanskrithi School of Engineering.
I further declare that this project report has not previously been submitted, either in
part or in full, for the award of any degree or diploma by any organization or
university.
ABSTRACT
QUOTATION
“Education is not the learning of facts, but the training of the
mind to think”
- Albert Einstein
CHAPTER - 1
1. INTRODUCTION
1.1 Natural Intelligence:
Natural intelligence is the intelligence created by nature through natural evolutionary
mechanisms: biological intelligence embodied in the brains of animals and humans, and in any
hypothetical alien intelligence.
The key question is: “Will it ever be possible to understand the human brain?”
Scientists still have no reliable model of how the brain actually works, and
“neuroscience is still in its infancy,” capable of assembling a multitude of facts but struggling
to determine the relationships between them.
“Neuroscience does not have, as physics does, a standard model that serves as a
conceptual structure in which gaps of knowledge and inconsistencies can be isolated and
serve as impetus for experiments, technological improvements or elaborate calculations. In
this workshop the speakers are asked to present embryos of brain theories that could develop
into a ‘standard model of the functions of the mammalian brain’.”
• Recognition
• Motion Analysis
• Scene Reconstruction
• Image Restoration
Neural networks can help computers make intelligent decisions with limited human
assistance. This is because they can learn and model the relationships between input and
output data that are nonlinear and complex.
CHAPTER – 2
2.1.2 Objectives
The objectives of the project are:
1. Identifying facial expressions using image processing algorithms.
2. Classifying facial expressions using Convolutional Neural Networks.
3. Obtaining the output from video frames/images or from live recognition of expressions.
PAPERS REFERRED:
2.2.1 TITLE: Facial expression detection by combining deep learning neural
networks
2.2.2 TITLE: A brief review of facial emotion recognition based on visual information
AUTHORS: Byoung Chul Ko, Ekmann and Friesen
YEAR: 2018
Conventional FER approaches perform face and facial component detection, feature
extraction and expression classification using algorithms such as SVM, AdaBoost and random
forest, whereas deep-learning-based approaches use CNNs on the visual information. Deep
learning reduces the dependence on face-physics-based models and on pre-processing
techniques by enabling “end-to-end” learning in the pipeline directly from the input image.
However, CNN-based FER methods cannot reflect the temporal variations in the facial
components.
2.2.4 TITLE: Facial emotion recognition using deep learning: review and insights
AUTHORS: Mehrabian, Ekman, Freisen, Agrwal et mittal, Deepak Jain, Mohammad pour
YEAR: 2020
The review describes how facial expressions are coded and how these features are extracted
so that a computer can make better predictions. It presents the architectures and databases
used and the progress made, by comparing the proposed methods. CNN and CNN-LSTM are
exploited to achieve better performance. Verbal and non-verbal information is captured by
various sensors. The methods are still limited to learning only the six basic emotions plus
neutral, rather than emotions that are more complex.
CHAPTER-3
3. SOFTWARE REQUIREMENTS
PLATFORM : Anaconda Navigator
TOOLS : Jupyter Notebook, Spyder
As the project is developed in Python, we used Anaconda with Python 3.6.5 and
Spyder.
3.1 Anaconda:
It is a free and open source distribution of the Python and R programming languages
for data science and machine learning related applications (large-scale data processing,
predictive analytics, scientific computing), that aims to simplify package management and
deployment. Package versions are managed by the package management system conda. The
Anaconda distribution is used by over 6 million users, and it includes more than 250 popular
data science packages suitable for Windows, Linux, and Mac OS.
3.2 Spyder:
Spyder (formerly Pydee) is an open source cross-platform integrated development
environment (IDE) for scientific programming in the Python language. Spyder integrates
NumPy, SciPy, Matplotlib and IPython, as well as other open source software. It is released
under the MIT license. Spyder is extensible with plug-ins, includes support for interactive
tools for data inspection and embeds Python-specific code quality assurance and
introspection instruments, such as Pyflakes, Pylint and Rope. It is available cross-platform
through Anaconda, on Windows with WinPython and Python (x,y), on mac OS through
MacPorts, and on major Linux distributions such as Arch Linux, Debian, Fedora, Gentoo
Linux, openSUSE and Ubuntu. Features include:
• an editor with syntax highlighting and introspection for code completion
• support for multiple Python consoles (including IPython)
• the ability to explore and edit variables from a GUI
Available plug-ins include:
• Static Code Analysis with Pylint
• Code Profiling
• Conda Package Manager with Conda
Jupyter Book is an open-source project for building books and documents from
computational material. It allows the user to construct the content from a mixture of Markdown,
an extended version of Markdown called MyST, maths and equations using MathJax, Jupyter
Notebooks, reStructuredText, and the output of running Jupyter Notebooks at build time.
Multiple output formats can be produced (currently single files, multipage HTML web pages
and PDF files).
3.4 Pandas
Pandas is an open-source library that is made mainly for working with relational or
labelled data both easily and intuitively. It provides various data structures and operations for
manipulating numerical data and time series. This library is built on top of the NumPy library.
Pandas is fast and offers high performance and productivity for users. Pandas was initially
developed by Wes McKinney in 2008 while he was working at AQR Capital Management. He
convinced AQR to allow him to open-source pandas. Another AQR employee, Chang
She, joined as the second major contributor to the library in 2012. Many versions of pandas
have been released over time; at the time of writing, the latest version was 1.4.1.
3.5 Numpy
NumPy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object, and tools for working with these arrays. It is the fundamental
package for scientific computing with Python. It is open-source software. It contains various
features including these important ones:
• A powerful N-dimensional array object
• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient
multi-dimensional container of generic data. Arbitrary data types can be defined using NumPy,
which allows it to integrate seamlessly and speedily with a wide variety of databases.
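As a minimal illustration of the N-dimensional array object and of broadcasting, the sketch below scales a tiny 2×3 "image" to the range [0, 1] and then subtracts a per-column mean. The array values and variable names are made up for the example.

```python
import numpy as np

# A made-up 2x3 grayscale "image" with 8-bit pixel values.
image = np.array([[0, 51, 102],
                  [153, 204, 255]], dtype=np.uint8)

# Broadcasting: the scalar 255.0 is applied to every element.
scaled = image / 255.0

# Broadcasting a (3,) vector across both rows of the (2, 3) array.
column_means = scaled.mean(axis=0)
centered = scaled - column_means
```

The same element-wise style is what makes it cheap to normalise a whole batch of images in one expression.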
Keras is:
• Simple -- but not simplistic. Keras reduces developer cognitive load to free you to
focus on the parts of the problem that really matter.
• Flexible -- Keras adopts the principle of progressive disclosure of complexity: simple
workflows should be quick and easy, while arbitrarily advanced workflows should
be possible via a clear path that builds upon what you've already learned.
• Powerful -- Keras provides industry-strength performance and scalability: it is used
by organizations and companies including NASA, YouTube, or Waymo.
Keras empowers engineers and researchers to take full advantage of the scalability and
cross-platform capabilities of TensorFlow 2: you can run Keras on TPU or on large clusters
of GPUs, and you can export your Keras models to run in the browser or on a mobile device.
CHAPTER 4
4. PROPOSED METHODOLOGY
BLOCK DIAGRAM
VIDEO FRAME / IMAGE → FACE DETECTION → FACE SEGMENTATION → FACIAL LANDMARKS DETECTION →
IMAGE ANALYSIS TO DETERMINE EMOTIONS → OUTPUT COMPILATION
4.1 Description:
Deep learning is a class of machine learning algorithms suitable for extracting
features of varying complexity from inputs. In image processing,
most such algorithms are based on artificial neural networks, usually CNNs. Depending
on the complexity of the network, it can recognize anything from something as simple as
numbers or letters to something as detailed as persons or faces. We constructed an
application that processes video inputs to determine the emotions of persons appearing in
the videos. The block diagram above schematically shows the steps of this process. Our inputs
for image processing, with the goal of determining emotions, are frames captured through a
webcam. The photos we used were taken from the FER-2013 dataset. For the video frames, a
face detection algorithm was employed, using the Haar-Cascade Classifier available in the
OpenCV library. The coordinates of a bounding rectangle for each detected face are given as
output. As we want our algorithm to perform FER under real-world conditions, it is
likely that persons will not always be in front of our camera, so many input frames may
be processed unnecessarily. We therefore use a flag that tells us whether there are persons
in the frame.
4.2 Implementation:
4.2.1 Face Detection:
Face detection using Haar cascades is a machine-learning-based approach in which a
cascade function is trained with a set of input data. OpenCV already contains pre-trained
classifiers for faces. In this project we use Haar Cascade and LBP cascade classifiers for
face detection. The implementation of the face detection works as follows.
• The detection works only on grayscale images, so it is important to convert the color
image to grayscale.
• The detectMultiScale function is used to detect the faces. It takes 3 arguments: the input
image, scaleFactor and minNeighbors. scaleFactor specifies how much the image size
is reduced at each image scale. minNeighbors specifies how many neighbors each
candidate rectangle should have for it to be retained.
• faces contains a list of coordinates for the rectangular regions where faces were
found. We use these coordinates to draw the rectangles in our image.
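The rectangles returned by the detector are typically (x, y, w, h) tuples in pixel coordinates. A minimal sketch of how such coordinates can be used to crop face regions with NumPy slicing is given below. The helper name crop_faces and the sample rectangle are assumptions made for illustration, and no detector is actually run here.

```python
import numpy as np

def crop_faces(gray_image, faces):
    """Return one cropped sub-array per (x, y, w, h) rectangle."""
    crops = []
    for (x, y, w, h) in faces:
        # Rows are indexed by y and columns by x.
        crops.append(gray_image[y:y + h, x:x + w])
    return crops

# Hypothetical 100x120 grayscale frame with one hand-written rectangle.
frame = np.zeros((100, 120), dtype=np.uint8)
frame[20:60, 30:80] = 200       # pretend this bright block is a face
faces = [(30, 20, 50, 40)]      # x, y, width, height
crops = crop_faces(frame, faces)
```

Note that the row index comes first when slicing, so the y range precedes the x range even though the rectangle is given as (x, y, w, h).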
• Getting Data
• Preparing data
• Image Augmentation
• Build model and train
• Use the webcam for detection
We used the FER-2013 dataset, which is publicly available on Kaggle. It has 48×48
pixel grayscale images of faces along with their emotion labels. We start by importing pandas
and some essential libraries and then loading the dataset, as shown in the figure.
This dataset contains 3 columns, emotion, pixels and Usage. Emotion column
contains integer encoded emotions and pixels column contains pixels in the form of a string
separated by spaces, and usage tells if data is made for training or testing purpose. This
dataset contains 7 Emotions :- (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise,
6=Neutral).
The data is not in the right format, so we need to pre-process it. Here X_train and
X_test contain the pixels, and y_train and y_test contain the emotions. At this stage the
pixel values in X_train and X_test are in the form of strings; converting them into numbers
is easy, as we just need to typecast. y_train and y_test contain 1D integer-encoded labels,
which we need to convert into categorical data for efficient training. num_classes = 7
indicates that we have 7 classes to classify.
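The two conversions described above (pixel strings to numbers, and integer labels to categorical form) can be sketched with NumPy alone. The helper names parse_pixels and one_hot and the sample values are hypothetical; the project code itself uses Keras' to_categorical for the second step.

```python
import numpy as np

num_classes = 7

def parse_pixels(pixel_string):
    """Convert a space-separated pixel string into a uint8 vector."""
    return np.array([int(p) for p in pixel_string.split()], dtype=np.uint8)

def one_hot(labels, num_classes):
    """Convert integer labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

pixels = parse_pixels("70 80 82 72")        # made-up 4-pixel row
targets = one_hot([0, 3, 6], num_classes)   # Angry, Happy, Neutral
```

Each row of targets contains a single 1 in the column of its class, which is the form a softmax output layer is trained against.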
We need to convert the data into a 4D tensor (row_num, width, height, channel) for
training purposes. Here the channel value 1 tells us that the training data is in grayscale
form. At this stage, we have successfully pre-processed our data into X_train,
X_test, y_train and y_test.
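A short sketch of the reshaping step is given below. It assumes the parsed images are stored as rows of 48*48 values; the stand-in data is generated rather than read from FER-2013, and the variable names are illustrative.

```python
import numpy as np

width, height = 48, 48

# Stand-in for parsed pixel rows: 5 made-up images of 48*48 values each.
flat_images = (np.arange(5 * width * height) % 256).astype(np.uint8)
flat_images = flat_images.reshape(5, width * height)

# (row_num, width, height, channel); channel = 1 means grayscale.
X = flat_images.reshape(flat_images.shape[0], width, height, 1)

# Scale to [0, 1], which is what a rescale factor of 1/255 does during training.
X_scaled = X.astype(np.float32) / 255.0
```

The reshape does not copy or reorder pixel values; it only changes how the same buffer is indexed.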
We design the CNN model for emotion detection using the functional API. We
create blocks using Conv2D, BatchNormalization, MaxPooling2D, Dropout and Flatten
layers, stack them together, and use a Dense layer at the end for the output. Building
the model with the functional API gives more flexibility. FER_Model takes the input size and
returns the model for training. Now let’s define the architecture of the model.
We load the trained model architecture and weights so that the model can be used
to make predictions.
We use a Haar cascade to detect the positions of faces; after getting the
positions, we crop the faces.
Adding an overlay on the output frame and displaying the prediction with its
confidence gives a better look.
Model Validation:
The confusion matrix gives the counts of emotion predictions and some insight into the
performance of the multi-class classification model.
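To make the idea concrete, the sketch below builds a confusion matrix by hand for three made-up classes and derives the overall accuracy and the per-class recall from it. The function name confusion_counts and the labels are assumptions for illustration; the project code relies on a library routine instead.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

# Made-up labels for three classes, purely for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_counts(y_true, y_pred, num_classes=3)

accuracy = np.trace(cm) / cm.sum()                 # correct / total
per_class_recall = cm.diagonal() / cm.sum(axis=1)  # correct / true count
```

Off-diagonal entries show which expressions are confused with which, something a single accuracy figure cannot reveal.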
The following figures show the outputs of the face detection techniques.
Fig18: Haar Cascade classifier Output
Fig19: LBP cascade classifier Output
Input Image:
The following figures show the output frames for inputs read directly from the webcam.
The following figures show the output frames for inputs taken from the datasets.
Conclusion:
In this project, the first level implemented face detection using both algorithms
(Haar Cascade and LBP Cascade). Based on the results of both algorithms, the best
results were obtained with the Haar cascade. The second level focused on expression
recognition by CNN with live images and the dataset. Both were compared, and the best
result is discussed in terms of accuracy and loss.
CHAPTER 5
5. SOFTWARE ANALYSIS
5.1 Face Detection Technique
It is a true challenge to build an automated system that equals the human ability to
detect faces and to recognize or estimate human body dimensions or body parts from an
image or a video. The conceptual and intellectual challenge of the problem is that faces are
non-rigid and have a high degree of variability in size, shape, color and texture.
Applications include auto focus in cameras, visual surveillance, traffic safety monitoring
and human-computer interaction. Face recognition follows a pattern that focuses on the face
or body. Face detection is the stepping stone to all facial analysis algorithms, including
face alignment, face modeling, face recognition and many more. Only when computers
can recognize faces, by computing the facial expression and matching the expression to
the facial structure, do they begin to truly understand people's thoughts and intentions.
Given an arbitrary image, the goal of face detection is to determine whether or not there
are any faces in the image and, if a face is present, to return the image location and
extent of each face.
5.1.1 Haar Cascade Algorithm:
[Flowchart: Start; decision “If positive response” (Yes/No); “Face is detected”; decision
“If window size is max” (Yes/No); Stop]
Step 1 – Live video feed of the driver is being captured through the webcam which is
mounted in the vehicle.
Step 3 – Each frame/image is taken and, using the Viola-Jones algorithm, faces are detected
and the facial features are extracted.
Step 4 – The detected faces are stored in a database/artificial neural network, and there is a
simultaneous comparison between the faces being detected and the faces already stored in the
database.
Step 5 – If a match is found and the driver shows abnormal behavior, a signal is sent
to the hardware part for delivering the output.
Step 6 – If a match is not found, go back to Step 4 and repeat the process until the video
capturing is turned off.
The first step is to collect the Haar features. A Haar feature is essentially a calculation
performed on adjacent rectangular regions at a specific location in a detection
window. The calculation involves summing the pixel intensities in each region and calculating
the differences between the sums. Some examples of Haar features are shown below. These
features can be expensive to compute for a large image. This is where integral images come
into play, because the integral image reduces the number of operations.
Without going into too much of the mathematics behind it, integral images essentially
speed up the calculation of these Haar features. Instead of computing at every pixel, the
method creates sub-rectangles and array references for each of those sub-rectangles. These are
then used to compute the Haar features.
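The role of the integral image can be illustrated with a small NumPy sketch. Once the summed-area table is built, the sum over any rectangle needs only four array lookups, so a two-rectangle Haar feature costs a handful of operations regardless of its size. The function names and the 3×4 sample image are assumptions made for this example.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle whose top-left corner is (x, y)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half: a simple edge-type Haar feature."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

img = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]], dtype=np.int64)
ii = integral_image(img)
inner_sum = rect_sum(ii, 1, 1, 2, 2)             # 6 + 7 + 10 + 11
edge_value = two_rect_feature(ii, 0, 0, 4, 3)    # left columns minus right columns
```

The extra zero row and column avoid special cases for rectangles that touch the top or left border of the image.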
The cascade classifier is made up of a series of stages, where each stage is a collection
of weak learners. Weak learners are trained using boosting, which allows for a highly accurate
classifier from the mean prediction of all weak learners.
Based on this prediction, the classifier either decides to indicate an object was found
(positive) or move on to the next region (negative). Stages are designed to reject negative
samples as fast as possible, because a majority of the windows do not contain anything of
interest.
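The stage-by-stage rejection described above can be sketched as follows. This is a simplified illustration rather than OpenCV's implementation: the stage structure, thresholds and weights are invented, and each stage is assumed to combine its weak learners by a weighted vote.

```python
def run_cascade(window_features, stages):
    """Return True only if every stage accepts the window.

    Each stage is (weak_learners, stage_threshold); each weak learner is
    (feature_index, threshold, weight) and votes when the feature value
    exceeds its threshold.
    """
    for weak_learners, stage_threshold in stages:
        score = sum(weight for index, threshold, weight in weak_learners
                    if window_features[index] > threshold)
        if score < stage_threshold:
            return False  # rejected early; later stages are skipped
    return True

# Hypothetical two-stage cascade over three feature values.
stages = [
    ([(0, 0.5, 1.0)], 1.0),
    ([(1, 0.2, 0.6), (2, 0.7, 0.6)], 1.0),
]
face_like = [0.9, 0.4, 0.8]
background = [0.1, 0.9, 0.9]
face_result = run_cascade(face_like, stages)
background_result = run_cascade(background, stages)
```

Because most windows fail the first cheap stage, the more expensive later stages run only on the few windows that look promising.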
It is important to keep the false negative rate low, because classifying an object as a
non-object will severely impair the object detection algorithm. A video below shows Haar
cascades in action. The red boxes denote “positives” from the weak learners.
Haar cascades are one of many algorithms currently used for object
detection. One thing to note about Haar cascades is that it is very important to reduce the
false negative rate, so the hyperparameters should be tuned accordingly when training the
model.
The 256-bin histogram of the labels computed over an image can be used as a
texture descriptor. Each bin of the histogram (LBP code) can be regarded as a micro-texton.
Local primitives codified by these bins include different types of curved edges,
spots, flat areas, etc. Figure2 shows some examples.
The LBP operator has been extended to consider different neighborhood sizes. For
example, the operator LBP_{4,1} uses 4 neighbors, while LBP_{16,2} considers the 16 neighbors
on a circle of radius 2. In general, the operator LBP_{P,R} refers to a neighborhood of P
equally spaced pixels on a circle of radius R that form a circularly symmetric neighbor set.
LBP_{P,R} produces 2^P different output values, corresponding to the 2^P different binary
patterns that can be formed by the P pixels in the neighbor set. It has been shown that certain
bins contain more information than others. Therefore, it is possible to use only a subset of
the 2^P LBPs to describe textured images. These fundamental patterns are defined as those
with a small number of bitwise transitions from 0 to 1 and vice versa. For example, 00000000
and 11111111 contain 0 transitions, while 00000110 and 01111110 contain 2 transitions, and so
on. Accumulating the patterns which have more than 2 transitions into a single bin yields an
LBP descriptor. The most important properties of LBP features are their tolerance against
where i = 0, ..., L-1 and j = 0, ..., M-1. The extracted feature histogram describes the local
texture and global shape of face images.
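The basic LBP operator and the transition count used to identify the fundamental patterns can be sketched in a few lines. The example assumes the 8-neighbor, radius-1 case on a 3×3 patch; the neighbor ordering, function names and sample patch are choices made for illustration only.

```python
import numpy as np

def lbp_code(patch):
    """LBP code (8 neighbors, radius 1) of the centre pixel of a 3x3 patch."""
    center = patch[1, 1]
    # Neighbours visited clockwise, starting at the top-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[r, c] >= center else 0 for r, c in order]
    return sum(bit << i for i, bit in enumerate(bits))

def transitions(code, bits=8):
    """Number of circular 0/1 transitions in the binary pattern."""
    pattern = [(code >> i) & 1 for i in range(bits)]
    return sum(pattern[i] != pattern[(i + 1) % bits] for i in range(bits))

patch = np.array([[9, 9, 9],
                  [1, 5, 9],
                  [1, 1, 1]])
code = lbp_code(patch)
code_transitions = transitions(code)
```

Patterns with at most 2 transitions each keep their own histogram bin, while all remaining patterns share a single bin.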
The weak classifier is designed to select the single LBP histogram bin which best
separates the positive and negative examples. A weak classifier h_j(x) consists of
a feature f_j, which corresponds to an LBP histogram bin, a threshold θ_j and a parity p_j
indicating the direction of the inequality sign:
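A weak classifier of this kind is commonly written, following the Viola-Jones formulation, as h_j(x) = 1 if p_j f_j(x) < p_j θ_j and 0 otherwise. The sketch below assumes that form; the bin indices, thresholds and parity values are invented for the example.

```python
def weak_classify(histogram, bin_index, threshold, parity):
    """Return 1 (face) if parity * f_j(x) < parity * threshold, else 0."""
    feature_value = histogram[bin_index]
    return 1 if parity * feature_value < parity * threshold else 0

# Made-up 4-bin LBP histogram for one detection window.
histogram = [12, 40, 3, 9]
below = weak_classify(histogram, bin_index=2, threshold=5, parity=1)
above = weak_classify(histogram, bin_index=1, threshold=5, parity=-1)
```

With parity +1 the classifier fires when the bin count is below the threshold; with parity -1 the inequality is reversed and it fires when the count is above it.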
enable to detect rotated faces. Because a face rotated by any multiple of 90˚ can be
detected by rotating the LBP operator, only ±18˚, 12˚ and 6˚ rotated face examples are added
to the training set. With this training set, face detection works well; it can detect faces
in images with a low false alarm rate. However, it cannot detect faces in low light
conditions or dark-skinned faces. There are two approaches to solving this problem: one is
image preprocessing and the other is enhancing the training set. Image preprocessing
estimates the illumination condition of an image and enhances its quality, but this method
is computationally expensive and not feasible on a mobile product.
Therefore, to enable the system to detect faces in low light conditions, faces under various
illuminations and dark-skinned faces are also added to this training set.
Figure 30 Example face images from the training set with rotation.
Figure30. Example face images from the training set with various illumination
conditions.
In total, 57,134 face images were collected and used as the positive training set. To collect
non-face patterns, the “bootstrap” strategy was used in five iterations. First, the system
extracts 200 patterns per image from a set of false-alarm-causing images which do not contain
faces. Because most false alarms come from trees, characters, handwriting and
fabrics, these kinds of images were used as the false-alarm-causing image set. Some examples
are shown in Figure9. Then, at the end of each training iteration, the face detector was run,
and all non-face patterns that were wrongly classified as faces were collected and used for
training. Negative training examples were then extracted from the false-alarm-causing image
set again. To obtain more effective negative examples, the classifiers found in the previous
iteration were used to choose negative examples that were misclassified as faces.
An image is nothing but a 2-dimensional array. Before training on images, we need to
process the dataset. By processing the dataset, we mean converting each image into a NumPy
array, in which each row represents an image. The conversion uses functions built into the
NumPy package. After this step the dataset is ready to be used for training the model.
Neural networks are organized in layers. Each layer of a neural network contains nodes which
calculate values based on their inputs and weights. The activation functions are ReLU for
hidden layers and either sigmoid or softmax for output layers.
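The two activation functions most relevant to this project, ReLU and softmax, can be written in a few lines of NumPy. This is a minimal sketch with made-up input values; the seven logits stand in for the seven emotion classes.

```python
import numpy as np

def relu(x):
    """Keep positive values and replace negative values with zero."""
    return np.maximum(0.0, x)

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

hidden = relu(np.array([-2.0, 0.0, 3.0]))
probabilities = softmax(np.array([1.0, 2.0, 3.0, 0.5, 0.0, -1.0, 0.2]))
predicted_class = int(np.argmax(probabilities))
```

The index of the largest probability is taken as the predicted class, and the probability itself can be reported as the confidence.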
Pooling:
The max pooling operation slides a 2-dimensional filter over each channel of the
feature map and extracts the maximum value within each window. The pooling layer is used to
reduce the dimensions of the feature map. It thereby reduces the number of parameters to learn
and the amount of computation to perform. The pooling layer summarises the features present
in a region of the feature map generated by the convolution layer.
Average Pooling is a pooling operation that calculates the average value for patches of
a feature map, and uses it to create a down sampled (pooled) feature map. It is usually used
after a Convolutional layer. It adds a small amount of translation invariance - meaning
translating the image by a small amount does not significantly affect the values of most
pooled outputs. It extracts features more smoothly than Max Pooling, whereas max pooling
extracts more pronounced features like edges.
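The difference between the two pooling operations can be seen on a small example. The sketch below assumes a 2×2 window with stride 2 and a single-channel feature map with even height and width; the function name pool2x2 and the values are made up for illustration.

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """2x2 pooling with stride 2 on a 2-D map with even height and width."""
    h, w = feature_map.shape
    # blocks[i, a, j, b] == feature_map[2*i + a, 2*j + b]
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [5, 7, 4, 2],
                        [0, 0, 9, 1],
                        [2, 2, 3, 3]], dtype=np.float64)
max_pooled = pool2x2(feature_map, "max")
avg_pooled = pool2x2(feature_map, "average")
```

In the bottom-right block the single strong response of 9 survives max pooling unchanged, whereas average pooling dilutes it to 4.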
Flattening:
The flattening operation is performed when we have a multidimensional output and want
to convert it into a single long continuous linear vector. The flattened vector is fed as
input to the fully connected layer.
[Flowchart: Start → Convolutional Layers → End]
Choose a dataset of interest, or create your own image dataset for solving
your own image classification problem. An easy place to find a dataset is kaggle.com.
Preparing the dataset for training involves assigning paths, creating categories (labels)
and resizing the images.
Training is an array that will contain the image pixel values and the index of the image's
category in the CATEGORIES list.
The shape of both lists will be used in classification with the neural network.
CHAPTER - 6
6. SOURCE CODE
import numpy as np
import pandas as pd
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
import matplotlib.pyplot as plt
import scipy
df = pd.read_csv('fer2013.csv')
df.head()
df.info()
num_classes = 7
width = 48
height = 48
emotion_labels = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
classes=np.array(("Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"))
df.Usage.value_counts()
k = np.array(list(map(int,df.iloc[0,1].split(" "))),dtype='uint8').reshape((48,48))
k.shape
X_train = []
y_train = []
X_test = []
y_test = []
for index, row in df.iterrows():
    k = row['pixels'].split(" ")
    if row['Usage'] == 'Training':
        X_train.append(np.array(k))
        y_train.append(row['emotion'])
    elif row['Usage'] == 'PublicTest':
        X_test.append(np.array(k))
        y_test.append(row['emotion'])
X_train[0]
plt.imshow(np.array(X_train[0], dtype = 'uint8').reshape(48,48,1), cmap = 'gray')
X_train = np.array(X_train, dtype = 'uint8')
y_train = np.array(y_train, dtype = 'uint8')
X_test = np.array(X_test, dtype = 'uint8')
y_test = np.array(y_test, dtype = 'uint8')
X_train = X_train.reshape(X_train.shape[0], 48, 48, 1)
X_test = X_test.reshape(X_test.shape[0], 48, 48, 1)
X_train.shape
import keras
from tensorflow.keras.utils import to_categorical
y_train= to_categorical(y_train, num_classes=7)
y_test = to_categorical(y_test, num_classes=7)
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=10,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1,
    fill_mode='nearest')
testgen = ImageDataGenerator(
    rescale=1./255
)
datagen.fit(X_train)
batch_size = 64
train_flow = datagen.flow(X_train, y_train, batch_size=batch_size)
test_flow = testgen.flow(X_test, y_test, batch_size=batch_size)
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9):
    conv4_2 = BatchNormalization()(conv4_2)
    conv4_3 = Conv2D(256, kernel_size=3, activation='relu', padding='same', name='conv4_3')(conv4_2)
    conv4_3 = BatchNormalization()(conv4_3)
    conv4_4 = Conv2D(256, kernel_size=3, activation='relu', padding='same', name='conv4_4')(conv4_3)
    conv4_4 = BatchNormalization()(conv4_4)
    pool4_1 = MaxPooling2D(pool_size=(2,2), name='pool4_1')(conv4_4)
    drop4_1 = Dropout(0.3, name='drop4_1')(pool4_1)
    return model
model = FER_Model()
num_epochs = 50
# Step counts should be whole numbers, so integer division is used.
history = model.fit(train_flow,
                    steps_per_epoch=len(X_train) // batch_size,
                    epochs=num_epochs,
                    verbose=2,
                    callbacks=callbacks_list,
                    validation_data=test_flow,
                    validation_steps=len(X_test) // batch_size)
train_loss=history.history['loss']
val_loss=history.history['val_loss']
train_acc=history.history['accuracy']
val_acc=history.history['val_accuracy']
epochs = range(len(train_acc))
plt.plot(epochs,train_loss,'r', label='train_loss')
plt.plot(epochs,val_loss,'b', label='val_loss')
plt.title('train_loss vs val_loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.figure()
plt.plot(epochs,train_acc,'r', label='train_acc')
plt.plot(epochs,val_acc,'b', label='val_acc')
plt.title('train_acc vs val_acc')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.figure()
model.save('./working/Fer2013.h5')
loss = model.evaluate(X_test/255., y_test)
print("Test Loss " + str(loss[0]))
print("Test Acc: " + str(loss[1]))
import itertools
from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(y_test, y_pred, classes,
                          normalize=False,
                          title='Unnormalized confusion matrix',
                          cmap=plt.cm.Blues):
    # Count how often each true class is predicted as each class.
    cm = confusion_matrix(y_test, y_pred)
    if normalize:
        # Divide each row by its total so that rows sum to 1.
        cm = np.round(cm.astype('float') / cm.sum(axis=1)[:, np.newaxis], 2)
    np.set_printoptions(precision=2)
    plt.tight_layout()
    plt.ylabel('True expression')
    plt.xlabel('Predicted expression')
    plt.show()
y_pred_ = model.predict(X_test/255., verbose=1)
y_pred = np.argmax(y_pred_, axis=1)
t_te = np.argmax(y_test, axis=1)
fig = plot_confusion_matrix(y_test=t_te, y_pred=y_pred,
                            classes=classes,
                            normalize=True,
                            cmap=plt.cm.Greys,
                            title='Average accuracy: ' + str(np.sum(y_pred == t_te)/len(t_te)) + '\n')
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("model.h5")
print("Saved model to disk")
import cv2
model.load_weights('model.h5')
# model = load_model('static\Fer2013.h5')
face_haar_cascade = cv2.CascadeClassifier('C:/Users/Sai Gopesh/.spyder-py3/haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
while cap.isOpened():
    res, frame = cap.read()
model.load_weights('model.h5')
face_haar_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
frame=cv2.imread("expr2.jpg")
    emotion_prediction = emotion_detection[max_index]
    cv2.putText(res, "Sentiment: {}".format(emotion_prediction), (0, textY+22+5), FONT, 0.7, lable_color, 2)
    lable_violation = 'Confidence: {}'.format(str(np.round(np.max(predictions[0])*100, 1)) + "%")
    violation_text_dimension = cv2.getTextSize(lable_violation, FONT, FONT_SCALE, FONT_THICKNESS)[0]
    violation_x_axis = int(res.shape[1] - violation_text_dimension[0])
    cv2.putText(res, lable_violation, (violation_x_axis, textY+22+5), FONT, 0.7, lable_color, 2)
except:
    pass
frame[0:int(height/6), 0:int(width)] = res
cv2.imshow('frame', frame)
cv2.waitKey(0)
CHAPTER – 7
7. RESULTS:
CHAPTER - 8
In this project, we performed face detection using Haar cascade classifiers and LBP cascade
classifiers. Based on the results of these face detection algorithms, we chose the Haar
cascade classifier for face detection. After face detection, we worked on facial expression
recognition, for which we used a Convolutional Neural Network trained on the FER-2013 data
set. We supplied inputs both as still images and as a webcam feed, and compared the results:
expressions in still images were recognized more accurately than those in the webcam feed.
The training accuracy achieved was 0.75 after 50 epochs. The relatively small amount of data
for emotions such as “disgust” makes it difficult for the model to predict them. Improving
the prediction of the “disgust” expression is left as future work; further training samples
for this harder-to-predict emotion will be required to improve such a system. With some
further work, the model could be deployed in real-life applications in domains such as
healthcare, marketing and the video game industry.
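The effect of an under-represented class such as “disgust” can be made visible by counting the labels before training. The following is a minimal sketch, not part of the project code: the helper names class_shares and underrepresented, the threshold rule (half of the uniform share) and the label counts are all assumptions chosen for illustration, and they are not the actual FER-2013 counts.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of samples that belongs to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {name: counts[name] / total for name in counts}


def underrepresented(labels, factor=0.5):
    """Return classes whose share is below `factor` times the uniform share.

    The threshold rule is a hypothetical choice made for this sketch.
    """
    shares = class_shares(labels)
    uniform = 1 / len(shares)
    return sorted(name for name, share in shares.items() if share < factor * uniform)


# Made-up label counts for illustration only.
labels = ["happy"] * 50 + ["sad"] * 30 + ["angry"] * 15 + ["disgust"] * 5

shares = class_shares(labels)
rare = underrepresented(labels)

print(shares)
print(rare)
```

Classes flagged in this way are the ones for which additional training samples would be collected first.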
CHAPTER – 9
REFERENCES:
• “Facial expression detection by combining deep learning neural networks”, The 12th
International Symposium on Advanced Topics in Electrical Engineering, March 25-27,
2021, Bucharest, Romania.
• B.C. Ko, “A brief review of facial emotion recognition based on visual information”,
Sensors, vol. 18, no. 2, 401, Jan. 2018.
• D. Mehta, M.F.H. Siddiqui, and A.Y. Javaid, “Recognition of emotion intensities
using machine learning algorithms: A comparative study”, Sensors, vol. 19, no. 8,
1897, Apr. 2019.
• “Facial emotion recognition using deep learning: review and insights”, The 2nd
International Workshop on the Future of Internet of Everything (FIoE), August 9-12,
2020, Leuven, Belgium.
• A. De Souza, A. Lopes, E. Aguiar, and T. Oliveira-Santos, “Facial expression
recognition with convolutional neural networks: coping with few data and the training
sample order”, Pattern Recognition, vol. 61, pp. 610-628, Jan. 2016.
• Jyostna Devi Bodapati and N. Veeranjaneyulu, “Facial Emotion Recognition Using Deep
CNN Based Features”.
• Akash Saravanan, Gurudutt Perichetla, and K.S. Gayathri, “Facial Emotion Recognition
using Convolutional Neural Networks”.
• https://www.geeksforgeeks.org/opencv-python-tutorial/
• https://www.analyticsvidhya.com/blog/2021/11/facial-emotion-detection-using-cnn/
• https://en.wikipedia.org/wiki/CNN
• https://keras.io/about/
• https://opencv.org/about/