Face Mask Detection
INTRODUCTION
1.1. Motivation
At present, the Coronavirus disease (COVID-19) pandemic, which first occurred in Wuhan, China, has spread to almost all countries and can be transmitted through human-to-human contact, even by patients who have never visited an epidemic area. According to the World
Health Organization (WHO), about 6,637,519 people have been infected and 391,161 have died. The COVID-19 virus created a paramount health emergency in the history of
mankind [1]. The virus can spread through droplets from a contaminated individual [2, 3]. The
most important defense against the virus is the face mask, which is also advised by the WHO [4,
5]. One of the preventive solutions is wearing a mask, which stops the virus from spreading through
the air and reduces each person's chance of getting infected. Many organizations require everyone
who wants to receive their services to wear a mask; however, the number of users or customers far
exceeds the number of service providers, which makes rigorous checking difficult.
It is not only important to wear a mask but also to wear it in a way that covers the
nose and mouth completely. Wearing the mask inappropriately can also spread the virus and will
not provide significant protection [6]. Developing a face mask recognizer that detects not only
the mask but also how correctly a person is wearing it can help prevent an
outbreak of the virus and save many lives [7]. This face mask recognizer can be used in public
places to monitor the crowd and identify individuals who are not wearing a mask or who
are wearing it incorrectly. This can help spread awareness and educate people about the correct way
to wear a mask, and it can help frontline workers focus on eradication of the virus
[8]. Since a face mask is our shield against the virus, developing this model was essential,
and with the scare of the new variants it has high application value, which motivated
the idea for this study.
1.2. Objective
The outbreak of the Coronavirus pandemic has changed the lifestyle of everyone around the
world, and among those changes wearing a mask has become vital for every individual. Detecting
people who are not wearing masks is a challenge because of the large size of the population. This
project can be used in schools, hospitals, banks, airports, etc. as a digitalized scanning tool.
Detected faces are segregated into two classes, namely people with masks and people without
masks, with the help of deep learning. With the help of this project, a person who has to monitor
people can be seated in a remote area and still monitor efficiently and give instructions
accordingly. A GitHub dataset consisting of images with and without masks was used. For the
purpose of this study a pre-trained convolutional neural network, AlexNet, was used.
1.3. Dataset
Masks play a crucial role in protecting the health of individuals against respiratory
diseases, as they are one of the few precautions available for COVID-19 in the absence of
immunization. With this dataset, it is possible to create a model to detect people wearing masks,
not wearing them, or wearing masks improperly. The dataset contains 1,000 images gathered
from GitHub, belonging to 2 classes.
The classes are:
With mask;
Without mask;
CHAPTER 2
LITERATURE REVIEW
Object detection is one of the trending topics in the field of image processing and
computer vision. Ranging from small scale personal applications to large scale industrial
applications, object detection and recognition are employed in a wide range of industries. Some
examples include image retrieval, security and intelligence, OCR, medical imaging and
agricultural monitoring. In object detection, an image is read and one or more objects in that
image are categorized. The location of those objects is also specified by a boundary called the
bounding box. Traditionally, researchers used pattern recognition to predict faces based on prior
face models. A breakthrough face detection technology, the Viola-Jones detector, was then
developed as an optimized technique based on Haar features [9], digital image features used in
object recognition. However, it failed because it did not perform well on faces in dark areas and
on non-frontal faces. Since then, researchers have been eager to develop new algorithms based on deep
learning to improve the models. Deep learning allows us to learn features in an end-to-end
manner, removing the need for prior knowledge when designing feature extractors. Deep-learning-based
object detection methods are divided into two categories: one-stage and two-stage object detectors.
Two-stage detectors use two neural networks to detect objects, for instance region-based
convolutional neural networks (R-CNN) and Faster R-CNN. The first neural network is used to
generate region proposals and the second one refines these region proposals, performing a
coarse-to-fine detection. This strategy yields high detection performance at the cost of speed.
The seminal work, R-CNN, was proposed by R. Girshick et al. [10]. R-CNN uses selective
search to propose candidate regions which may contain objects. After that, the proposals
are fed into a CNN model to extract features, and a support vector machine (SVM) is used to
recognize the classes of objects. However, the second stage of R-CNN is computationally expensive
since the network has to process the proposals one by one and uses a separate SVM for the
final classification. Fast R-CNN [11] solves this problem by introducing a region of interest
(ROI) pooling layer that processes all proposal regions at once. Faster R-CNN [12] is the evolution of
R-CNN and Fast R-CNN, and as the name implies its training and testing speed is greater than
those of its predecessors. While R-CNN and Fast R-CNN use selective search algorithms that
limit detection speed, Faster R-CNN learns the proposed object regions itself using a
region proposal network (RPN).
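As a concrete illustration of this two-stage pipeline, the following minimal sketch runs a pre-trained Faster R-CNN from torchvision on a single image; the image path, score threshold and the pretrained flag (older torchvision API) are illustrative assumptions, not part of the cited works.

# Minimal sketch: inference with a pre-trained two-stage detector (Faster R-CNN).
# Assumes torchvision and Pillow are installed; the image path is illustrative.
import torch
import torchvision
from torchvision.transforms import functional as F
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()  # inference mode: the RPN proposes regions, the second stage refines them

image = Image.open("sample.jpg").convert("RGB")  # hypothetical input image
tensor = F.to_tensor(image)                      # [C, H, W] scaled to [0, 1]

with torch.no_grad():
    outputs = model([tensor])[0]  # dict with 'boxes', 'labels', 'scores'

for box, label, score in zip(outputs["boxes"], outputs["labels"], outputs["scores"]):
    if score > 0.5:  # confidence threshold chosen arbitrarily for illustration
        print(label.item(), score.item(), box.tolist())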
On the other hand, a one-stage detector utilizes only a single neural network for both region
proposals and detection; some primary examples are SSD (Single Shot Detector) [13] and
YOLO (You Only Look Once) [14]. To achieve this, the bounding boxes must be predefined.
YOLO divides the image into several cells and then matches the bounding boxes to objects in
each cell. This, however, does not work well for small objects. To address this, multi-scale detection was
introduced in SSD, which can detect objects of varying sizes in an image. Later, in order to
improve detection accuracy, Lin et al. [15] proposed the Retina Network (RetinaNet), which combines
an SSD with a feature pyramid network (FPN) to increase detection accuracy and reduce class
imbalance. One-stage detectors are faster but trade off some detection performance, and are therefore
preferred over two-stage detectors only when speed matters most.
Like object detection, face detection adopts the same architectures as one-stage and two-
stage detectors, but in order to improve face detection accuracy, more face-like features are being
added. However, there is occasional research focusing on face mask detection. Some already
existing facemask detectors have been modeled using OpenCV, Pytorch Lightning, MobileNet,
RetinaNet and Support Vector Machines. Here, we will be discussing two projects. One project
used Real World Masked Face Dataset (RMFD) which contains 5,000 masked faces of 525
people and 90,000 normal faces [16]. These images are 250 x 250 in dimensions and cover all
races and ethnicities and are unbalanced. This project took 100 x 100 images as input, and
therefore, transformed each sample image when querying it, by resizing it to 100x100.
Moreover, this project uses PyTorch, so the images are converted to tensors, the base data
type that PyTorch works with. RMFD is imbalanced (5,000 masked faces vs 90,000 non-
masked faces). Therefore, the ratio of the samples in the train/validation split
was kept equal using the train_test_split function of sklearn. Moreover, to deal with the unbalanced
data, they passed this information to the loss function to avoid disproportionate step sizes of the
optimizer. They did this by assigning a weight to each class according to its representation in
the dataset. They assigned more weight to classes with a small number of samples so that the
network is penalized more if it makes mistakes predicting the labels of these classes, while
classes with large numbers of samples were assigned a smaller weight. This makes the
network training agnostic to the proportion of classes.
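The class-weighting strategy described above can be sketched in PyTorch by passing per-class weights to the loss function; the inverse-frequency weighting below is one common choice and not necessarily the exact scheme used in [16].

# Sketch: handling class imbalance by weighting the loss.
# The class counts come from the text; the weighting formula is an assumption.
import torch
import torch.nn as nn

class_counts = torch.tensor([5000.0, 90000.0])   # masked vs. non-masked samples
weights = class_counts.sum() / (len(class_counts) * class_counts)  # rarer class gets a larger weight

criterion = nn.CrossEntropyLoss(weight=weights)

# Usage: logits from any classifier with 2 output units, labels in {0, 1}
logits = torch.randn(8, 2)            # dummy batch of predictions
labels = torch.randint(0, 2, (8,))    # dummy ground-truth labels
loss = criterion(logits, labels)      # mistakes on the rare class are penalized more
print(loss.item())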
In the second project [17], a dataset was created by Prajna Bhandary using a PyImage
Search reader. This dataset consists of 1,376 images belonging to all races and is balanced. There
are 690 images with masks and 686 without masks. Firstly, it took normal images of faces and
then created a customized computer vision Python script to add face masks to them. Thereby, it
created a real-world applicable artificial dataset. This method used the facial landmarks which
allow them to detect the different parts of the faces such as eyes, eyebrows, nose, mouth, jaw line
etc. To use the facial landmarks, it takes a picture of a person who is not wearing a mask, and,
then, it detects the portion of that person’s face. After knowing the location of the face in the
image, it extracted the face region of interest (ROI). After localizing the facial landmarks, a picture
of a mask is placed onto the face. In this project, embedded devices are used for deployment, which
could reduce the cost of manufacturing. The MobileNetV2 architecture is used as it is a highly
efficient architecture for embedded devices with limited computational capacity such as the
Google Coral and NVIDIA Jetson Nano. This project performed well; however, if a large portion of
the face is occluded by the mask, the model could not detect whether a person is wearing a mask
or not. The dataset used to train the face detector did not contain images of people wearing face
masks; as a result, if a large portion of the face is occluded, the face detector would probably fail
to detect it properly. To get rid of this problem, they should gather actual images of people wearing
masks rather than artificially generated images.
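A minimal sketch of a MobileNetV2-based mask classifier in Keras, in the spirit of the second project; the input size, head layers and optimizer settings are illustrative assumptions rather than the configuration used in [17].

# Sketch: transfer learning with MobileNetV2 for a two-class mask classifier.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained feature extractor

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(2, activation="softmax"),  # with_mask vs. without_mask
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()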
Initially, researchers focused on the edge and gray values of face images. The method in [18] was based on
a pattern recognition model with prior information about the face model. AdaBoost [19] was a
good training classifier. Face detection technology got a breakthrough with the famous Viola-
Jones detector [20], which greatly improved real-time face detection. The Viola-Jones detector
optimized the Haar features [21], but failed to tackle real-world problems and was
influenced by various factors like face brightness and face orientation. Viola-Jones could only
detect frontal, well-lit faces; it failed to work well in dark conditions and with non-frontal images.
These issues have led independent researchers to work on new face detection
models based on deep learning, to obtain better results under different facial conditions. We have
developed our face detection model using the Multi-Human Parsing Dataset [22], based on fully
convolutional networks, such that it can detect the face in any geometric condition, frontal or
non-frontal.
Convolutional networks have long been used for image classification tasks. Typical
architectures like AlexNet [23] and VGGNet [24] comprise stacked convolutional layers.
AlexNet, with 5 convolutional layers and 3 fully connected layers, was the winner of the
ImageNet ILSVRC-2012 competition, while VGGNet is an improvement over AlexNet as it
replaces large kernels with multiple consecutive 3x3 kernels. The ILSVRC-2014 winning
architecture GoogLeNet [25] uses parallel convolution kernels and concatenates the resulting feature
maps; 1x1, 3x3 and 5x5 convolutions and 3x3 max-pooling are used.
Smaller convolutions extract local features whereas larger convolutions extract high-level
features. More recent architectures such as ResNet [26] introduced skip connections, which
allow deeper networks to avoid saturation in training accuracy. These architectures are often
used for initial feature extraction in face detection networks. In our method, we use the VGG-16
architecture as the base network for face detection and a fully convolutional network for
segmentation. The VGG-16 network is sufficiently deep to extract features and computationally less
expensive for our case. Though the majority of segmentation architectures rely on downsampling
and subsequent upsampling of the input image, fully convolutional networks [27], [28], [29] remain
a modest yet significantly accurate approach to segmentation.
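A minimal sketch of using VGG-16 as a base feature extractor, assuming PyTorch/torchvision; the segmentation head that would consume these features is intentionally omitted, and the input tensor is a placeholder.

# Sketch: VGG-16 convolutional backbone as a feature extractor.
import torch
import torchvision

vgg = torchvision.models.vgg16(pretrained=True)
backbone = vgg.features            # the stacked 3x3 convolutional layers
backbone.eval()

dummy = torch.randn(1, 3, 224, 224)        # hypothetical input image tensor
with torch.no_grad():
    feature_map = backbone(dummy)          # [1, 512, 7, 7] for a 224x224 input
print(feature_map.shape)                   # these features would feed the FCN head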
In [30], the authors developed a method to identify face mask-wearing conditions. They
were able to classify three categories of face mask-wearing: correct face mask-wearing,
incorrect face mask-wearing and no face mask-wearing. Saber et al. [31] applied
principal component analysis (PCA) on masked and unmasked face recognition to recognize the
person. PCA was also utilized in [32], where the author proposed a method for removing
glasses from human frontal faces. In [33], the authors used the YOLOv3 algorithm for face
detection; YOLOv3 uses Darknet-53 as the backbone. Nizam et al. [34] proposed a
novel GAN-based network which automatically removes the mask covering the
face area and regenerates the image by filling in the missing region. In [35], the authors presented a
system for detecting the presence or absence of a compulsory medical mask in the operating room. The
general aim is to minimize false positive face detections as much as possible without missing mask
detections, so that alarms are triggered only for medical staff who do not wear a surgical mask. Shaik et al.
[36] used deep learning for real-time face emotion classification and recognition; they used VGG-
16 to classify seven facial expressions. Under the current COVID-19 lockdown, this technique is
effective in preventing spread in many use cases. Here are some use cases which can benefit from the
system.
Airports: the proposed system could be vital for finding travelers without masks at airports. The
travelers' data can be captured as video by the system at the entrance. If any passenger is found
without a mask, the airport authorities are alerted so that they can act quickly.
Hospitals: the proposed system can be integrated with CCTV cameras, and the data can be
monitored to check whether employees are wearing masks. If doctors or staff are found
not wearing a mask, they will receive a reminder to wear one.
Offices: the proposed system can help maintain safety standards to prevent the spread of
COVID-19 or any such airborne disease. If some employees are not wearing masks, they will
receive reminders to wear a mask. The choice of the system must be based on the best
performance, so the best system performance indicators are used to allow large-scale
implementation. The system has been used with the MobileNetV2 classifier.
MobileNetV2 [37]: MobileNetV2 is the latest technology in mobile visual recognition,
including classification, object detection and semantic segmentation. The classifier uses depthwise
separable convolutions, whose purpose is to significantly reduce the computational cost and
model size of the network, making it suitable for mobile devices or devices with low computing
power. Another important module introduced in MobileNetV2 is the inverted residual structure,
in which the nonlinearity within the narrow layers is removed. When used as the backbone for feature
extraction, MobileNetV2 achieves the best performance in object detection and semantic
segmentation.
The work in [38] discusses the use of MTCNN for the detection of masked faces. Face
recognition is a promising area in the field of computer vision. Some devices use Face
recognition as an alternative to a fingerprint scanner. CNN has the ability to learn valuable
features by itself. The author used IIIT-Delhi masked face images dataset and applied data
augmentation to enlarge the dataset so that reliability and efficiency can be improved. They used
a pre-trained Multi-task Cascaded Convolutional Neural Network (MTCNN) for the detection of
faces from the dataset. MTCNN outperforms many other face-detection tools. It works in 3
stages. First, it creates multiple copies of the images of different scales. This is called an image
pyramid. The first stage is called the P-Net or Proposal Network. It introduces candidate facial
regions. The second stage is the R-Net or refinement network. It refines the bounding boxes. The
third and final stage is O-Net or Output Network. It determines the final landmarks on the image.
In image post processing, the images are cropped and resized according to the FaceNet
Specification i.e. 160x160. A pre-trained FaceNet Model was used as a baseline for deep
networks. It used 22 deep convolutional layers. A large number of images of masked and
unmasked faces were used to train the model. The classification was done with the help of the
Support Vector Machine (SVM). The results of this methodology were promising. It gave
accuracy up to 98.50% in some datasets and cases.
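A minimal sketch of the MTCNN-plus-FaceNet pre-processing described above, assuming the third-party mtcnn package and OpenCV are installed; the image path is illustrative, and the FaceNet/SVM stages are only indicated in a comment.

# Sketch: three-stage MTCNN face detection followed by cropping to the FaceNet input size (160x160).
import cv2
from mtcnn import MTCNN

detector = MTCNN()                                    # P-Net, R-Net and O-Net under the hood
image = cv2.cvtColor(cv2.imread("person.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical image

for face in detector.detect_faces(image):             # each result has a box, confidence and landmarks
    x, y, w, h = face["box"]
    crop = image[y:y + h, x:x + w]
    crop = cv2.resize(crop, (160, 160))               # FaceNet specification from the text
    # `crop` would then be fed to a FaceNet embedding model and an SVM classifier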
The work in [39] proposes using two components: i) a deep neural network based on the YOLOv3
model for identifying one or more riders on a motorbike, and ii) another neural
network for detecting whether the rider is wearing a helmet or not. In this system, the traffic
surveillance system provides input to the model and the video frames are given as input to the
CNN for detecting helmets on the riders. Initially, the YOLOv3 is used for detecting the
motorbike and the riders. The YOLOv3 model is an improved version of YOLO which was
developed by J. Redmon. The model can detect huge sets of classes; among them only two
classes i.e. person and motorbike are detected. The boxes are drawn around the target to localize
the objects. The network predicts 4 coordinates: bx and by, the center coordinates, and bw and
bh, the width and height of the target. The overlapping area between the motorbike and the
person is taken from the bounding boxes to determine whether the person is a motorbike
rider or not: the Euclidean distance between the center coordinates of the two bounding
boxes is computed, and if this distance falls within the bounding box of the motorbike,
the targeted person is taken to be the rider of that motorbike. The CNN model is
then used to identify and classify whether the rider is wearing a helmet or not. For this, the top
one-fourth part of the identified motorbike rider is sent as input from the output received from
the YOLOv3 model.
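A simplified sketch of this rider-association step; the box format and the distance criterion below are illustrative assumptions rather than the exact rule from [39].

# Sketch: associate a detected person with a detected motorbike using box centers.
import math

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def is_rider(person_box, bike_box):
    """Treat the person as the rider if the distance between the two box centers
    is smaller than half of the motorbike box's diagonal (a simplified criterion)."""
    px, py = box_center(person_box)
    bx, by = box_center(bike_box)
    x1, y1, x2, y2 = bike_box
    half_diag = 0.5 * math.hypot(x2 - x1, y2 - y1)
    return math.hypot(px - bx, py - by) <= half_diag

person = (120, 40, 180, 200)      # dummy person detection (x1, y1, x2, y2)
motorbike = (100, 120, 200, 260)  # dummy motorbike detection
print(is_rider(person, motorbike))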
The CNN model consists of five layers of which the input layer takes the input from the
input image and passes the image through consecutive convolutional layers where each layer
transforms the image using specific features and sends it to the next layer. Each layer filters the
input image given and extracts the required features with plenty of differentiating attributes to
distinguish the target object from other objects. After these five layers, two additional fully
connected layers are added. Based on the extracted features, the softmax classifier predicts the
class probabilities for wearing a helmet or not wearing a
helmet. The CNN predicts bounding boxes along with class probabilities for accuracy of
prediction. In the detection process, the input image is divided into an N×N grid. This grid is
responsible for object detection of any kind of object that falls into that grid’s cell. Each
bounding box consists of 4 measures: px, py, w, h where (px, py) coordinates represent the
center of the box relative to the bounds of the grid cell. The height (h) and width (w) are
predicted relative to the whole image.
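A small sketch of decoding a grid-cell-relative prediction (px, py, w, h) into absolute pixel coordinates, following the convention described above; the grid size and values are illustrative.

# Sketch: convert a grid-relative box prediction into absolute (x1, y1, x2, y2) coordinates.
def decode_box(px, py, w, h, row, col, grid_n, img_w, img_h):
    """(px, py) are relative to the grid cell, (w, h) are relative to the whole image."""
    cell_w, cell_h = img_w / grid_n, img_h / grid_n
    center_x = (col + px) * cell_w
    center_y = (row + py) * cell_h
    box_w, box_h = w * img_w, h * img_h
    x1, y1 = center_x - box_w / 2, center_y - box_h / 2
    return x1, y1, x1 + box_w, y1 + box_h

print(decode_box(0.5, 0.5, 0.2, 0.3, row=3, col=4, grid_n=7, img_w=416, img_h=416))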
The proposed paper explains how to detect single or multiple riders of a
motorbike who are not wearing helmets in traffic surveillance videos. First, the YOLOv3
model is used for motorcyclist detection. Then, the proposed lightweight convolutional
neural network detects the wearing of a helmet or no helmet for all motorbike riders. This project
performs better than other CNN based helmet detection methods and can be extended in the
future to detect more complicated cases of several riders including child riders. As YOLOv3 and
CNN models detect a person's face accurately from a given image and can tell whether a person
is wearing a helmet or not, so one can also use these models to determine if a person is wearing a
face mask or not.
The work in [40] proposed a new technique of helmet detection which combines two methods in
order to make the detection rate better. Those two methods are i) Haar like feature and ii) Circle
Hough transform. By using these methods the system detects whether a person is wearing a full
helmet or half helmet. When the system receives video input it first separates the images from
video then uses a Haar like feature for detection of a full helmet. As we know the human face is
full of contrast (e.g. eye region is darker than the cheek region), Haar like feature uses these
contrasts to encode the human face, nose, mouth, eyebrows, right eye, left eye. This paper has
used 14 feature prototypes to encode the features which include Edge features(4), Center
surround features (2) and Line features (8). For each 24 x 24 sub-window there are more
than 117,000 rectangular features, so a weak learning algorithm is used to select only specific
rectangles. To boost classification performance they used the AdaBoost
classifier, and to increase detection efficiency they used a cascade classifier, which also
reduces the computation time radically. For the detection of half helmets, the circle Hough transform
method is used by the authors. This method detects not only circular shapes but also
any kind of shapes in the given picture which makes it easy to locate helmets, and hence it makes
it possible to detect half helmets. This paper has overcome different issues which were raised
before while detecting full and half helmets. They have tested this algorithm in real-time and the
results are very positive. This paper proposed a new technique of masked face detection by
taking the help of video analytics which combines four steps in order to make the detection rate
better. The four steps are: i) Distance from camera ii) Eye line detection iii) Facial part detection
iv) Eye detection. Video analytics deals with the detection of people and events like walking,
falling, standing at the camera.
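For illustration, Haar-cascade detection of this kind can be run with OpenCV as sketched below; the bundled frontal-face cascade is used here, whereas helmet cascades would have to be trained separately, and the frame path is hypothetical.

# Sketch: Haar-cascade detection with OpenCV's bundled frontal-face cascade.
import cv2

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

gray = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2GRAY)  # illustrative frame
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    print("face at", x, y, w, h)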
The work in [41] uses Analog Devices Inc.'s CrossCore Embedded Studio
(CCES) together with a HOG-SVM approach for person detection. It also describes the Histogram
of Oriented Gradients (HOG) method, a feature set based on evaluating well-normalized local
histograms of image gradient orientations in a dense grid. Compared to the best Haar wavelet-
based detector, it gives good results for person detection, with relatively lower false-positive rates.
The main idea is to detect whether a person is wearing a mask or not: if a person is detected
but their face is not, it can be assumed that the person is wearing a mask. However, this
is also true when a person is facing away from the camera; in such
a situation the system detects the person but not their face and gives the wrong output. Therefore, to
deal with such scenarios, it is important to find out whether a person is coming towards or moving away
from the camera.
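A minimal sketch of HOG + SVM person detection using OpenCV's default people detector; this is an off-the-shelf illustration, not the CCES implementation from [41], and the frame path and parameters are illustrative.

# Sketch: pedestrian detection with OpenCV's HOG descriptor and bundled SVM model.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame = cv2.imread("street.jpg")                 # illustrative surveillance frame
rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), padding=(8, 8), scale=1.05)

for (x, y, w, h), score in zip(rects, weights):
    print("person at", x, y, w, h, "score", float(score))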
To determine whether a person is approaching the camera or moving away from it, the
author has discussed four steps in this paper. The first step is the
distance-from-the-camera method. This method is used to see if a person is approaching the
camera or going away from a camera. As the decreasing distance between a person and a camera
indicates that the person is approaching the camera and face detection can be triggered. The
second step is the eye line detection method. This method helps to find out the valley in
horizontal histogram projection. If the eye line is detected, face detection can be applied to see if
the person is wearing a mask or not. The third step is facial part detection. In this method, the
author has used Viola Jones’s algorithm to detect facial parts like nose, mouth, eyes, eyebrows,
etc. This algorithm results in a very high true detection rate and a very low false positive rate
which will be shown in the cases where a person is not wearing a mask. If any person is wearing
a mask or his/her face is covered with cloth or hand, then in such cases the detection of the face
might not take place, or face detection will take place but either nose or mouth will not be
detected indicating it as a mask. The final and most important step is to find out the eyes and
then trigger the face detection using the eye detection method. If the person is not wearing a
mask, eyes will be detected and face detection can thus be applied. When a person is wearing a
mask, eye detection returns true but face detection returns false indicating it is a mask.
This paper has stated that in video analytics, the false detection rate is maximum in eye
line detection algorithms as well as in eye detection algorithms. The reason is that eye line
detection and eye detection will detect very small parts of the image. For images with poor
resolution, it will result in false detection. For facial part detection, the execution time is
maximum as compared to all other steps as it deals with face detection followed by face parts
detection which is a complex algorithm. This paper has a detailed explanation of how to detect a
face mask and the authors have tested these above steps in real time and the results are quite
practical and satisfactory.
Dewantara et al. [72] trained a nose and mouth classifier to detect multi-pose
masked faces. The authors created a dataset of noses and mouths, and Haar-like, LBP and HOG features
were used to train the respective models. If the nose and mouth are not detected, the candidate
facial region is labeled "masked"; otherwise, it is labeled "no mask". It is reported
that the trained nose and mouth classifier achieves an accuracy of 86.9% using Haar-like
features, outperforming LBP and HOG. Obviously, there is further room to improve accuracy.
Petrovic et al. [73] developed an indoor safety IoT system which adopts multiple
AdaBoost cascade classifiers. These classifiers, provided by OpenCV, detect the frontal face,
nose, and mouth, respectively. For a candidate face region, if neither mouth nor nose is detected,
it is regarded as wearing a mask properly. If the nose is detected, it is labeled as "improper
mask". If the mouth is detected, it is labeled as "no mask". This approach may work well in
access control systems; however, it depends heavily on the OpenCV classifiers, and no details
about accuracy are provided.
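The decision rule of [73] can be sketched as follows, assuming nose and mouth cascade XML files are available locally (they are not bundled with every OpenCV build); the detection parameters are illustrative.

# Sketch of the rule in [73]: within a detected face region, absence of nose and mouth
# implies a properly worn mask.
import cv2

nose_cascade = cv2.CascadeClassifier("haarcascade_mcs_nose.xml")    # assumed local file
mouth_cascade = cv2.CascadeClassifier("haarcascade_mcs_mouth.xml")  # assumed local file

def mask_condition(face_gray):
    noses = nose_cascade.detectMultiScale(face_gray, 1.1, 5)
    mouths = mouth_cascade.detectMultiScale(face_gray, 1.1, 5)
    if len(mouths) > 0:
        return "no mask"        # mouth visible
    if len(noses) > 0:
        return "improper mask"  # nose visible but mouth covered
    return "mask"               # neither visible: mask worn properly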
Unlike the method in [73], Nieto-Rodriguez et al. [69] used two AdaBoost detectors to
implement surgical mask detection. One detector is trained by LogitBoost for face detection, and
the other is trained by GentleAdaBoost for mask detection. Then, two color filters in the HSV
color space are employed to eliminate false positives. Considering the overlapping regions, cross
class removal strategy is designed to keep the region with higher confidence. The method is easy
to implement and it achieves an accuracy of 95% on 496 faces and 181 masks.
Fang et al. [75] developed a real-time masked face detection system that uses Haar-
like features for face detection and mouth detection, respectively. Similar to [73], the face region
is first located, and then mouth detection is used to determine the mask-wearing condition.
The designed algorithm is claimed to run on PYNQ-Z2 SoC platform with 0.13s response of
facial mask detection and 96.5% accuracy on given dataset.
In addition, Tengjiao He [76] employed skin color and eye detection for mask-wearing
detection. The first step is to locate the face region using an ellipse skin model and the geometric
relationship between the eyes and other facial parts. Then, the coverage of skin color in the bottom
half of facial region is calculated to judge mask-wearing conditions. However, this method can
only be applied to specific scenes.
Razavi et al. [77] employed the Faster R-CNN structure to detect people who do not wear a
mask or do not maintain a safe distance. It was applied to several road maintenance projects for
monitoring workers, ensuring that they wear masks and keep a proper physical distance. However, the
dataset is limited and it only focuses on construction scenes. Meivel et al. [23] used the Faster R-
CNN algorithm for mask detection and social distance measurement. This method achieves
93.4% accuracy for complex scenes such as facial poses, beard faces, multiple mask types, and
scarf images. Notably, the effects need improvement when converting surveillance images into
bird-view images.
Zhang et al. [47] developed a new framework for masked face detection called Context-
Attention R-CNN, which consists of a multiple-context feature extractor component, a decoupling
branch component, and an attention component. By extracting distinguishing features, it is able to
enlarge the inter-class differences and reduce the intra-class differences. They also created a
dataset that includes 8,635 faces with different conditions for experimental verification. The
framework achieves mAP = 84.1% on the given dataset, 6.8% higher than that of Faster R-
CNN with ResNet-50. However, the dataset is class-imbalanced.
Chowdary et al. [78] exploited a pre-trained InceptionV3 model to classify whether one
wears a mask or not. The last layer of InceptionV3 is replaced by 5 new layers, making it a
transfer learning model. It is reported to reach 99.9% accuracy on a simulated dataset.
Dey et al. [60] proposed MobileNet-Mask, a multi-phase deep learning method for face
mask detection, to prevent the transmission of SARS-CoV-2. The mask classifier depends on the
ROI detection of SSD and ResNet-10. Due to its minimal processing requirements and lightweight
mobile-oriented design, MobileNetV2 is a good choice for embedded systems. It is reported to
achieve higher accuracy than other methods.
Deng et al. [79] introduced attention mechanisms, inverse convolution and feature fusion
into the SSD structure for the mask-wearing detection task. It achieves a mAP of 91.7%,
outperforming SSD with 85.4% mAP. Wang et al. [80] proposed a holistic edge computing
framework to detect masked faces. It is a serverless in-browser solution that integrates YOLO,
CNN inference computing, and WebAssembly techniques. This design minimizes extra devices;
it is easy to deploy, has low computation costs and fast detection speed, and achieves mAP = 89%.
Loey et al [42] developed a YOLOv2 with ResNet-50 detector for medical face mask
detection. The method includes two parts. The first is designed by deep transfer learning for
feature extraction. The second part is implemented by YOLOv2 for masked face detection.
Specifically, mean IoU is introduced to estimate the best number of anchor boxes, which can
improve the accuracy. The method achieves AP = 81% on a dataset with 1415 images.
Jiang et al. [50] designed a Squeeze-and-Excitation (SE) YOLOv3 to balance
effectiveness and running speed for masked face detection. It introduces SE into Darknet-53 as
an attention mechanism to extract essential features, and adopts GIoU loss and focal loss to
enhance stability and robustness. A new dataset called the Properly Wearing Masked Face Detection
(PWMFD) dataset was created with three categories of masked faces. It is reported that the method
achieves mAP = 73.7% for 608 x 608 images. The method is expected to be used in access
control gate systems and non-contact temperature measurement. However, the similarity between
incorrectly worn masks is high, which may cause confusion: masks covering only the chin can be
regarded as no mask.
Prusty et al. [26] proposed a data augmentation technique to expand the dataset size. The new
dataset is used to train a YOLOv3 model for masked face detection. The average accuracy is more
than 93% on the three given datasets. However, only two kinds of data augmentation techniques
(grayscale and Gaussian blur) are used, which is very limited. Kumar et al. [51] tested the
original and tiny variants of YOLO on a new face mask detection dataset which
encompasses 52,635 images; over 50k labels are provided for the dataset. A modified tiny
YOLOv4 is recommended as an effective and efficient masked face detector because of its
optimized feature extraction network.
Yu et al. [31] improved the YOLOv4 model by introducing a modified CSPDarkNet53 to
reduce computation costs and enhance learning ability. An adaptive image scaling algorithm is
designed to reduce redundancy, and an improved PANet structure is used to learn more semantic
information. It is reported to achieve 98.3% accuracy at 54.57 fps in a running
environment of Windows 10, an Intel(R) i7-9700K and an RTX 2070 Super. One limitation is that
samples with insufficient lighting are not considered.
Sharma [85] developed a model that uses YOLOv5 to detect whether a person is wearing a
mask or not. However, if an individual does not face the camera, its performance decreases,
which is the method's limitation. Yang et al. [87] applied YOLOv5 to the supervision of
mask-wearing conditions. The authors designed a man-machine interface for the application and
set the identification time to 2 seconds to account for complex scenes. A 97.9% recognition rate
is achieved on the dataset in [62], although the response time seems a bit long. Ieamsaard et al.
[88] tested a YOLOv5-based model trained for 300 epochs, which outperformed models trained
for fewer than 300 epochs.
Jiang et al [11] proposed RetinaFaceMask for masked face detection, which is based on
RetinaFace [95]. RetinaFaceMask is a single-stage detector. Its principle is to employ feature
pyramid network to fuse high-level semantic information. A novel context attention module is
presented to help RetinaFaceMask focus on the features of faces and masks. Moreover, a cross-
class removal algorithm is proposed to remove those regions with low scores and high IoU
values. Experiments demonstrate that RetinaFaceMask outperforms RetinaFace [95] in Recall
and Precision. Moreover, there are more experimental comparisons between methods. Singh et al
[48] utilized two object detection models named Faster R-CNN and YOLOv3 for masked facial
detection. They presented the comparison from visual and quantitative views, and gave detailed
discussions about the application. Faster R-CNN outperforms YOLOv3 in accuracy;
however, for real-time applications YOLOv3 is preferred because it runs faster than
Faster R-CNN. The selection of the model depends on the environmental conditions. A similar
conclusion is drawn in [96]. Roy et al [43] used SSD, Faster R-CNN, YOLOv3, and
YOLOv3Tiny to cope with the challenges of wearing medical mask detection. These methods
are tested on Moxa3K dataset. Experimental results demonstrate that YOLOv3Tiny is the most
suitable method for real-time inference among the methods.
Loey et al. [12] developed a hybrid method of deep learning and machine learning to detect
face masks. It includes two components (or stages): ResNet-50 is used as the feature extractor, and
an SVM, a decision tree, and an ensemble method are used as classification models. The authors claim that
the SVM classifier achieves a testing accuracy of 99.49% on the SMFD dataset [61], outperforming
the decision tree and the ensemble method. Similar to [12], the methods in [118], [119] also choose
an SVM as the classifier in the second stage.
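A minimal sketch of this hybrid pipeline, assuming torchvision and scikit-learn; the dummy arrays stand in for real preprocessed face crops and labels, and the SVM kernel is an illustrative choice.

# Sketch: ResNet-50 deep features classified by an SVM, in the spirit of [12].
import numpy as np
import torch
import torchvision
from sklearn.svm import SVC

resnet = torchvision.models.resnet50(pretrained=True)
resnet.fc = torch.nn.Identity()   # drop the final layer to expose 2048-d features
resnet.eval()

def extract_features(batch):      # batch: [N, 3, 224, 224] tensor of preprocessed images
    with torch.no_grad():
        return resnet(batch).numpy()

train_images = torch.randn(16, 3, 224, 224)          # placeholder for real training images
train_labels = np.random.randint(0, 2, size=16)      # placeholder mask / no-mask labels

svm = SVC(kernel="rbf")
svm.fit(extract_features(train_images), train_labels)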
Buciu [118] took the ratio of color channels into account to discriminate mask and no-
mask images. SSD is used to locate the positions of faces. Then the lower part of the face is
used to construct a feature vector called the color quotient feature, which is classified by an
SVM model. A recognition rate of 97.25% is obtained. However, this method is sensitive to
mask types, which is its potential weakness. Oumina et al [119] presented several combinations
of multiple CNNs with K-NN or SVM classifiers and conducted experiments. The results indicate
that the combination of MobileNetV2 and SVM achieves the best performance among the
combinations, with 97.11% accuracy. More tests of the approach should be conducted on bigger datasets.
Zereen et al. [120] developed a two-stage approach to detect masked faces and monitor
rule violations. It is based on the extraction of facial landmarks. It first determines whether the
target wears a multi-color mask or not using MTCNN, and then it determines whether the
target wears a skin-colored mask or not. The method aims to detect five types of facial images:
no mask, beard and mustache, one-color mask, multi-color mask and skin-colored mask.
It achieves an accuracy of 97.13% and overcomes the problem of detecting masks of various colors,
especially in differentiating skin-colored masks. However, the use of several techniques
increases the computation cost, and the setting of empirical thresholds limits its adaptability.
Lin et al [22] combined a sliding window algorithm with a modified LeNet (MLeNet) to
locate masked faces. To improve performance with a small dataset, horizontal reflection is used
to learn MLeNet via fine-tuning. MLeNet can be trained quickly in CPU mode, which makes sense
for real-world applications. However, the sliding window algorithm requires more computation for
large images, which restricts its performance. Rudraraju et al. [122] combined Haar-like
cascade-classifiers and two MobileNet models for face mask detection. Firstly, face regions are
detected by haar-like cascade-classifier. The first MobileNet model is used to classify masks and
no masks. The second MobileNet model is used to distinguish correct or incorrect wearing
masks. Experiments show that the system achieves around 90% accuracy. It is expected to be
deployed at fog gateway.
Tomas et al. [33] also chose a Haar-like cascade classifier for rapid face detection. A CNN
with transfer learning is used to determine whether one wears a mask or not. Multiple models are
trained on one dataset. VGG16 achieves the best performance with 0.834 accuracy, but its
model size is also the largest. For deployment on mobile devices, MobileNetV2, with 0.812 accuracy,
is selected as the classification model because it demands lower computation costs and smaller
storage. However, this method needs to be improved for detecting masked faces with
alterations and side views.
The method proposed by Lin et al [129] contains five stages: image data collection,
human posture parsing, ROI selection, image normalization, and classification of masked face.
Among these stages, human posture parsing is implemented with OpenPose [135], which generates 25
key points for one individual. Five key points belonging to the face region are used to extract the ROI
for image normalization. Then, the normalized image is classified by a Face Mask Recognition
Network (FMRN). It is reported that the method obtains 95.8% and 94.6% accuracy in daytime
and nighttime, respectively.
Table 2.1 summarizes a few notable works on face mask detection along with their
advantages and disadvantages.
Table 2.1 Comparison of various deep learning architectures
Gradient descent can be considered the most popular of the optimizers. This
optimization algorithm uses calculus to modify the values consistently and to reach the
local minimum. In simple terms, imagine you are holding a ball resting at the top of a bowl.
When you release the ball, it rolls along the steepest direction and eventually settles at the bottom of
the bowl. The gradient points in the steepest direction, guiding the ball towards the local minimum,
that is, the bottom of the bowl.
The gradient descent update rule is w = w - alpha * dJ(w)/dw, where alpha is the step size that
represents how far to move against the gradient at each iteration. Gradient descent works as
follows:
1. It starts with some coefficients, evaluates their cost, and searches for a cost value lower than
the current one.
2. It moves towards the lower cost and updates the values of the coefficients.
3. The process repeats until the local minimum is reached. A local minimum is a point
beyond which the cost cannot decrease further.
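A minimal sketch of this update rule on a toy one-dimensional cost, assuming J(w) = (w - 3)^2; the step size and iteration count are illustrative.

# Sketch: plain (batch) gradient descent mirroring w <- w - alpha * dJ/dw.
def grad(w):
    return 2.0 * (w - 3.0)   # derivative of (w - 3)^2

w = 10.0          # initial coefficient
alpha = 0.1       # step size (learning rate)
for step in range(100):
    w = w - alpha * grad(w)   # move against the gradient

print(w)  # converges towards the minimum at w = 3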
In stochastic gradient descent (SGD), the procedure is first to select the initial parameters w and
the learning rate η. The data is then randomly shuffled at each iteration to reach an approximate
minimum. Since we are not using the whole dataset but batches of it for each iteration, the path
taken by the algorithm is noisy compared to the batch gradient descent algorithm. Thus, SGD needs
a higher number of iterations to reach the local minimum, which increases the overall computation
time. But even with the increased number of iterations, the computation cost is still less than that
of the batch gradient descent optimizer. So the conclusion is that if the dataset is enormous and
computation time is an essential factor, stochastic gradient descent should be preferred over the
batch gradient descent algorithm.
The adaptive gradient descent algorithm (AdaGrad) is slightly different from the other gradient
descent algorithms, because it uses a different learning rate for each iteration. The change in
learning rate depends on how much the parameters change during training: the more a parameter
changes, the smaller its learning rate becomes. This modification is highly beneficial because
real-world datasets contain sparse as well as dense features, so it is unfair to use the same
learning rate for all features. The AdaGrad weight update can be written as
w_{t+1} = w_t - alpha_t * g_t, with alpha_t = eta / sqrt(sum_{i=1..t} g_i^2 + epsilon), where
alpha_t denotes the learning rate at iteration t, eta is a constant, g_i is the gradient at iteration i,
and epsilon is a small positive value to avoid division by zero.
The benefit of using AdaGrad is that it abolishes the need to tune the learning rate
manually. It is more reliable than the gradient descent algorithms and their variants, and it reaches
convergence at a higher speed.
One downside of AdaGrad optimizer is that it decreases the learning rate aggressively
and monotonically. There might be a point when the learning rate becomes extremely small. This
is because the squared gradients in the denominator keep accumulating, and thus the
denominator part keeps on increasing. Due to small learning rates, the model eventually becomes
unable to acquire more knowledge, and hence the accuracy of the model is compromised.
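A minimal sketch of the AdaGrad update on a toy cost, assuming NumPy; it shows the per-parameter accumulator that causes the shrinking learning rate discussed above, with illustrative values.

# Sketch: AdaGrad accumulates squared gradients, so eta / sqrt(G + eps) keeps shrinking.
import numpy as np

def adagrad_step(w, g, G, eta=0.1, eps=1e-8):
    G = G + g ** 2                      # accumulate squared gradients per parameter
    w = w - eta * g / np.sqrt(G + eps)  # parameter-specific, ever-shrinking step size
    return w, G

w = np.array([10.0, -5.0])
G = np.zeros_like(w)
for _ in range(200):
    g = 2.0 * w                   # gradient of the toy cost sum(w^2)
    w, G = adagrad_step(w, g, G)
print(w)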
RMSProp is one of the popular optimizers among deep learning enthusiasts, perhaps
because it was never formally published yet is still very well known in the community. RMSProp is
essentially an extension of Rprop. Rprop addresses the problem of varying gradients:
some gradients are small while others may be huge, so defining a single learning rate might not
be the best idea. Rprop uses the sign of the gradient, adapting the step size individually for each
weight. In this algorithm, the last two gradients are first compared by sign. If they have the same
sign, we are going in the right direction and the step size is increased by a small fraction;
if they have opposite signs, the step size is decreased. The step size is then clipped, and the
weight update is performed.
The problem with Rprop is that it does not work well with large datasets and with mini-batch
updates. Achieving the robustness of Rprop and the efficiency of mini-batches at the same time
was the main motivation behind RMSProp. RMSProp can also be considered an improvement over
the AdaGrad optimizer, as it avoids the monotonically decreasing learning rate.
The algorithm mainly focuses on accelerating the optimization process by decreasing the
number of function evaluations needed to reach the local minimum. The algorithm keeps a moving
average of the squared gradients for every weight and divides the gradient by the square root of this
mean square: E[g^2]_t = gamma * E[g^2]_{t-1} + (1 - gamma) * g_t^2, where gamma is the
forgetting factor. The weights are then updated by w_{t+1} = w_t - (eta / sqrt(E[g^2]_t + epsilon)) * g_t.
In simpler terms, if there is a parameter that makes the cost function oscillate a lot,
we want to penalize the update of this parameter. Suppose you built a model to classify a variety
of fishes, and the model relies mainly on the feature 'color' to differentiate between the fishes,
because of which it makes a lot of errors. What RMSProp does is penalize the parameter 'color' so
that the model can rely on other features too. This prevents the algorithm from adapting too quickly
to changes in the parameter 'color' compared to the other parameters. This algorithm has several
benefits compared to earlier versions of gradient descent algorithms: it converges quickly
and requires less tuning than gradient descent algorithms and their variants. The problem with
RMSProp is that the learning rate has to be defined manually and the suggested value does not
work for every application.
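A minimal sketch of the RMSProp update on the same toy cost, assuming NumPy; the forgetting factor gamma keeps the squared-gradient average from growing without bound, and all values are illustrative.

# Sketch: RMSProp replaces AdaGrad's growing sum with a leaky average of squared gradients.
import numpy as np

def rmsprop_step(w, g, avg_sq, eta=0.01, gamma=0.9, eps=1e-8):
    avg_sq = gamma * avg_sq + (1.0 - gamma) * g ** 2   # gamma is the forgetting factor
    w = w - eta * g / np.sqrt(avg_sq + eps)
    return w, avg_sq

w = np.array([10.0, -5.0])
avg_sq = np.zeros_like(w)
for _ in range(500):
    g = 2.0 * w                         # gradient of the toy cost sum(w^2)
    w, avg_sq = rmsprop_step(w, g, avg_sq)
print(w)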
AdaDelta can be seen as a more robust version of the AdaGrad optimizer. It is based on
adaptive learning and is designed to deal with the significant drawbacks of the AdaGrad and
RMSProp optimizers. The main problem with those two optimizers is that the initial learning rate
must be defined manually. Another problem is the decaying learning rate, which becomes
infinitesimally small at some point, so that after a certain number of iterations the model
can no longer learn new knowledge.
To deal with these problems, AdaDelta uses two state variables: s_t, a leaky average of the
second moment of the gradient, and delta_x_t, a leaky average of the second moment of the change
of the parameters in the model. The rescaled gradient is
g'_t = sqrt((delta_x_{t-1} + epsilon) / (s_t + epsilon)) * g_t, and the parameters are updated as
x_t = x_{t-1} - g'_t, where s_t and delta_x_t denote the state variables, g'_t denotes the rescaled
gradient, delta_x_{t-1} denotes the leaky average of the squared rescaled gradients from the previous
step, and epsilon represents a small positive constant to avoid division by zero.
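In practice all of the optimizers discussed above are available off the shelf; the snippet below, assuming PyTorch, merely shows how they would be instantiated for a stand-in model, with illustrative hyperparameters.

# Sketch: instantiating the optimizers discussed in this chapter with torch.optim.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for the mask classification network

optimizers = {
    "sgd": torch.optim.SGD(model.parameters(), lr=0.01),
    "adagrad": torch.optim.Adagrad(model.parameters(), lr=0.01),
    "rmsprop": torch.optim.RMSprop(model.parameters(), lr=0.001, alpha=0.9),
    "adadelta": torch.optim.Adadelta(model.parameters(), rho=0.9),
}
print(list(optimizers))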
CHAPTER 5
PROPOSED METHODOLOGY
5.1. Algorithm
As shown in Figure 5.1, the process starts with data collection. In this case study, the data is collected from an
available Kaggle dataset. It is then loaded and pre-processed in order to clean the
collected data, after which the data is split into training and testing sets. The training dataset is used to
train the model while the testing data is used to test it. A fully trained and tested model
results in effective and accurate detection of the presence or absence of a face mask.
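A minimal sketch of this pipeline, assuming a recent TensorFlow and scikit-learn and a hypothetical dataset/ directory with one sub-folder per class; the simple dense classifier is a placeholder, not the final model.

# Sketch of the Figure 5.1 pipeline: load images, preprocess, split, train and evaluate.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# 1. Data collection + preprocessing: load images from two class folders and rescale to [0, 1]
data = tf.keras.utils.image_dataset_from_directory(
    "dataset/", image_size=(224, 224), batch_size=32, label_mode="int")

images, labels = [], []
for batch_x, batch_y in data:
    images.append(batch_x.numpy() / 255.0)
    labels.append(batch_y.numpy())
X, y = np.concatenate(images), np.concatenate(labels)

# 2. Train / test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Train a placeholder classifier and 4. evaluate it on the held-out set
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, validation_split=0.1)
print(model.evaluate(X_test, y_test))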
REFERENCES
1. A.S. Joshi, S.S. Joshi, G. Kanahasabai, R. Kapil, S. Gupta, Deep Learning Framework to
Detect Face Masks from Video Footage, in: 2020 12th International Conference on
Computational Intelligence and Communication Networks (CICN), 2020, pp. 435–440.
2. S.M. Nagashetti, S. Biradar, S.D. Dambal, C.G. Raghavendra, B.D. Parameshachari,
“Detection of Disease in Bombyx Mori Silkworm by Using Image Analysis Approach”
2021 IEEE Mysore Sub Section International Conference (MysuruCon), IEEE (2021),
pp. 440-444.
3. R.K. Kodali, R. Dhanekula, “Face Mask Detection Using Deep Learning” 2021
International Conference on Computer Communication and Informatics (ICCCI) (2021),
pp. 1-5.
4. D.L. Vu, T.K. Nguyen, T.V. Nguyen, T.N. Nguyen, F. Massacci, P.H. Phung, “A
convolutional transformation network for malware classification” 2019 6th NAFOSTED
conference on information and computer science (NICS), IEEE (2019), pp. 234-239.
5. P. Khamlae, K. Sookhanaphibarn, W. Choensawat, “An Application of Deep-Learning
Techniques to Face Mask Detection During the COVID-19” Pandemic 2021 IEEE 3rd
Global Conference on Life Sciences and Technologies (LifeTech) (2021), pp. 298-299.
6. K. Yu, L. Tan, L. Lin, X. Cheng, Z. Yi, T. Sato, “Deep-learning-empowered breast
cancer auxiliary diagnosis for 5GB remote E-health”, IEEE Wirel.
Commun., 28 (3) (2021), pp. 54-61.
7. A. Alguzo, A. Alzu'bi, F. Albalas, “Masked Face Detection using Multi-Graph
Convolutional Networks”, 2021 12th International Conference on Information and
Communication Systems (ICICS) (2021), pp. 385-391.
8. M.S. Islam, E. Haque Moon, M.A. Shaikat, M. Jahangir Alam, “A Novel Approach to
Detect Face Mask using CNN”, 2020 3rd International Conference on Intelligent
Sustainable Systems (ICISS) (2020), pp. 800-806.
9. P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple
features,” in Proceedings of the 2001 IEEE computer society conference on computer
vision and pattern recognition. CVPR 2001, vol.1. IEEE, 2001, pp. I–I.
10. R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate
object detection and semantic segmentation,” in Proceedings of the IEEE conference on
computer vision and pattern recognition,2014, pp. 580–587.
11. R. Girshick, “Fast r-cnn,” in Proceedings of the IEEE international conference on
computer vision, 2015, pp. 1440–1448.
12. S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection
with region proposal networks,” in Advances in neural information processing systems,
2015, pp. 91–99.
13. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “Ssd:
Single shot multibox detector,” in European conference on computer vision. Springer,
2016, pp. 21–37.
14. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-
time object detection,” in Proceedings of the IEEE conference on computer vision and
pattern recognition, 2016, pp. 779–788.
15. T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object
detection,” 2017.
16. Haddad, J., 2020. How I Built A Face Mask Detector For COVID-19 Using Pytorch
Lightning.
17. Rosebrock, A., 2020. COVID-19: Face Mask Detector With Opencv, Keras/Tensorflow,
And Deep Learning- Pyimagesearch.
18. T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation
invariant texture classification with local binary patterns,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, July 2002.
19. T.-H. Kim, D.-C. Park, D.-M. Woo, T. Jeong, and S.-Y. Min, “Multi-class classifier-
based adaboost algorithm,” in Proceedings of the Second Sinoforeign-interchange
Conference on Intelligent Science and Intelligent Data Engineering, ser. IScIDE’11.
Berlin, Heidelberg: Springer-Verlag, 2012, pp. 122–127.
20. P. Viola and M. J. Jones, “Robust real-time face detection,” Int. J. Comput. Vision, vol.
57, no. 2, pp. 137–154, May 2004.
21. P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple
features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition. CVPR 2001, vol. 1, Dec 2001, pp. I–I.
22. J. Li, J. Zhao, Y. Wei, C. Lang, Y. Li, and J. Feng, “Towards real world human parsing:
Multiple-human parsing in the wild,” CoRR, vol. abs/1705.07206.
23. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep
convolutional neural networks,” in Advances in Neural Information Processing Systems
25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates,
Inc., 2012, pp. 1097–1105.
24. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale
image recognition,” CoRR, vol. abs/1409.1556, 2014.
25. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke,
and A. Rabinovich, “Going deeper with convolutions,” 2015.
26. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,”
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–
778, 2016.
27. K. Li, G. Ding, and H. Wang, “L-fcn: A lightweight fully convolutional network for
biomedical semantic segmentation,” in 2018 IEEE International Conference on
Bioinformatics and Biomedicine (BIBM), Dec 2018, pp. 2363–2367.
28. X. Fu and H. Qu, “Research on semantic segmentation of high-resolution remote sensing
image based on full convolutional neural network,” in 2018 12th International
Symposium on Antennas, Propagation and EM Theory (ISAPE), Dec 2018, pp. 1–4.
29. S. Kumar, A. Negi, J. N. Singh, and H. Verma, “A deep learning for brain tumor mri
images semantic segmentation using fcn,” in 2018 4th International Conference on
Computing Communication and Automation (ICCCA), Dec 2018, pp. 1–4.
30. B. QIN and D. Li, “Identifying face mask-wearing condition using image super-
resolution with classification network to prevent COVID-19”, May 2020.
31. M.S. Ejaz, M.R. Islam, M. Sifatullah, A. Sarker, “Implementation of principal component
analysis on masked and non-masked face recognition”, 2019 1st International Conference
on Advances in Science, Engineering and Robotics Technology (ICASERT) (2019), pp.
1-5.
32. Jeong-Seon Park, You Hwa Oh, Sang Chul Ahn, and Seong Whan Lee, “Glasses removal
from facial image using recursive error compensation,” IEEE Trans. Pattern Anal. Mach.
Intell. 27 (5) (2005) 805–811.
33. C. Li, R. Wang, J. Li, L. Fei, “Face detection based on YOLOv3”, in: Recent Trends in
Intelligent Computing, Communication and Devices, Singapore, 2020, pp. 277–284.
34. N. Ud Din, K. Javed, S. Bae, J. Yi, “A novel GAN-based network for unmasking of
masked face”, IEEE Access, 8 (2020), pp. 44276–44287.
35. A. Nieto-Rodríguez, M. Mucientes, V.M. Brea, “System for medical mask detection in
the operating room through facial attributes”, Pattern Recogn. Image Anal. Cham (2015),
pp. 138-145.
36. S. A. Hussain, A.S.A.A. Balushi, “A real time face emotion classification and recognition
using deep learning model”, J. Phys.: Conf. Ser. 1432 (2020) 012087.
37. A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto,
H. Adam, “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
Applications”.
38. M. S. Ejaz and M. R. Islam, "Masked Face Recognition Using Convolutional Neural
Network," 2019 International Conference on Sustainable Technologies for Industry 4.0
(STI), Dhaka, Bangladesh, 2019, pp. 1-6.
39. M. Dasgupta, O. Bandyopadhyay and S. Chatterji, "Automated Helmet Detection for
Multiple Motorcycle Riders using CNN," 2019 IEEE Conference on Information and
Communication Technology, Allahabad, India, 2019, pp. 1-4,
40. P. Doungmala and K. Klubsuwan, "Helmet Wearing Detection in Thailand Using Haar
Like Feature and Circle Hough Transform on Image Processing," 2016 IEEE
International Conference on Computer and Information Technology (CIT), Nadi, 2016,
pp. 611-614.
41. G. Deore, R. Bodhula, V. Udpikar and V. More, "Study of masked face detection
approach in video analytics," 2016 Conference on Advances in Signal Processing
(CASP), Pune, 2016, pp. 196-200.
APPENDIX –A
A.1 Introduction
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows
anybody to write and execute arbitrary python code through the browser, and is especially well
suited to machine learning, data analysis and education. More technically, Colab is a hosted
Jupyter notebook service that requires no setup to use, while providing access free of charge to
computing resources including GPUs.
A.2 What is Google Colab?
Google Colab is an excellent tool for deep learning tasks. It is a hosted Jupyter notebook
that requires no setup and has an excellent free version, which gives free access to Google
computing resources such as GPUs and TPUs.
In Colab, we can enforce the Python version by clicking Runtime -> Change Runtime Type and
selecting python3. Note that as of April 2020, Colab uses Python 3.6.9, which should run
everything without any errors.
A.3 How to run a Python module in Colab
1. Store mylib.py in your Drive.
2. Copy it with !cp drive/MyDrive/mylib.py .
3. import mylib (see the sketch below).
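A minimal sketch of these steps as Colab notebook cells; the file name mylib.py and the Drive path are taken from the example above.

# Sketch: mount Drive, copy the module next to the notebook, then import it.
from google.colab import drive
drive.mount('/content/drive')            # authorize access to your Google Drive

!cp /content/drive/MyDrive/mylib.py .    # shell command inside a notebook cell

import mylib                             # the module can now be imported as usual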
A.4 What types of GPUs are available in Colab?
The types of GPUs that are available in Colab vary over time. This is necessary for Colab
to be able to provide access to these resources free of charge. Users who are interested in more
reliable access to Colab’s fastest GPUs may be interested in Colab Pro and Pro+. If you would
like to use specific hardware in Colab, check out Colab GCP Marketplace VMs.
Note that using Colab for cryptocurrency mining is disallowed entirely, and may result in
your account being restricted for use with Colab altogether.
A.5 How long can notebooks run in Colab?
Notebooks run by connecting to virtual machines that are deleted when idle for a while and that
have a maximum lifetime enforced by the Colab service, so a notebook cannot run indefinitely
(see A.9).
A.6 Where are my notebooks stored, and can I share them?
Colab notebooks are stored in Google Drive, or can be loaded from GitHub. Colab
notebooks can be shared just as you would with Google Docs or Sheets. Simply click the Share
button at the top right of any Colab notebook, or follow these Google Drive file sharing
instructions.
A.7 If I share my notebook, what will be shared?
If you choose to share a notebook, the full contents of your notebook (text, code, output,
and comments) will be shared. You can omit code cell output from being saved or shared by
using Edit > Notebook settings > Omit code cell output when saving this notebook. The
virtual machine you’re using, including any custom files and libraries that you’ve setup, will not
be shared. So it’s a good idea to include cells which install and load any custom libraries or files
that your notebook needs.
A.8 How can I search Colab notebooks?
You can search Colab notebooks using Google Drive. Clicking on the Colab logo at the
top left of the notebook view will show all notebooks in Drive. You can also search for
notebooks that you have opened recently using File > Open notebook.
A.9 Where is my code executed? What happens to my execution state if I close the browser
window?
Code is executed in a virtual machine private to your account. Virtual machines are
deleted when idle for a while, and have a maximum lifetime enforced by the Colab service.
A.10 How can I get my data out?
You can download any Colab notebook that you’ve created from Google Drive following
these instructions, or from within Colab’s File menu. All Colab notebooks are stored in the open
source Jupyter notebook format ( .ipynb).
A.11 How can I reset the virtual machine(s) my code runs on, and why is this sometimes
unavailable?
Select Runtime > Disconnect and delete runtime to return all managed virtual
machines assigned to you to their original state. This can be helpful in cases where a virtual
machine has become unhealthy e.g. due to accidental overwrite of system files, or installation of
incompatible software. Colab limits how often this can be done to prevent undue resource
consumption. If an attempt fails, please try again later.
A.12 Why does drive.mount() sometimes fail saying "timed out", and why do I/O
operations in drive.mount()-mounted folders sometimes fail?
Google Drive operations can time out when the number of files or subfolders in a folder
grows too large. If thousands of items are directly contained in the top-level "My Drive" folder
then mounting the drive will likely time out. Repeated attempts may eventually succeed as failed
attempts cache partial state locally before timing out.
If you encounter this problem, try moving files and folders directly contained in "My Drive" into
sub-folders. A similar problem can occur when reading from other folders after a successful
drive.mount(). Accessing items in any folder containing many items can cause errors like
OSError: [Errno 5] Input/output error. Again, you can fix this problem by moving directly
contained items into sub-folders. Note that "deleting" files or subfolders by moving them to the
Trash may not be enough;
if that doesn't seem to help, make sure to also Empty your Trash.
A.13 Why do Drive operations sometimes fail due to quota?
Google Drive enforces various limits, including per-user and per-file operation count and
bandwidth quotas. Exceeding these limits will trigger Input/output error as above, and show a
notification in the Colab UI. A typical cause is accessing a popular shared file, or accessing too
many distinct files too quickly. Workarounds include:
1. Copy the file using drive.google.com and don't share it widely so that other users don't use up
its limits.
2. Avoid making many small I/O reads, instead opting to copy data from Drive to the Colab VM
in an archive format (e.g. .zip or .tar.gz files) and unarchive the data locally on the VM instead of
in the mounted Drive directory.
3. Wait a day for quota limits to reset.