Face Detection Using OpenCV and Python: A Beginner's Guide
“Leave me alone.”
These words send a shiver down my spine. But then again, they are the only comfort I get when I
use Snapchat these days.
Look:
I don’t know about you, BUT I SURE AS HELL don’t enjoy sharing my bed with Casper or any
other creepy ghosts that this otherworld-R.S.V.P-app has brought to my life.
You see, every once in a while I’m doing my dog filter faces like a normal human being in 2017;
but then… my cat stops moving and stares at the end of the room… the camera refocuses… and
then: it finds an invisible Dalmatian filter standing by my side.
A buddy of mine told me to look into the field of Computer Vision … It’s a crazy field where
machines learn how to extract useful information from images as we humans do.
Put simply, it replicates the process of looking at someone, recognizing them, and saying:
“Hey empty void of solitude, how’s it going?” and then getting no answer back because I have
no one in my life. Or the process of looking for your car in a parking lot: you scan the place and
spot your car at the farthest parking space available.
Either way, it’s the same process. Except you can teach it to a machine.
Knowing this, I’ve decided to venture on and write a two-part article with everything I’ve learned.
In this first part, I’ll focus on face detection using OpenCV, and in the next, I’ll dive into face
recognition. Then I’ll finally decide whether I should stay put and keep on selfy-ing (word TM
pending) online, or have to move once again.
This technology has been available for some years now and is being used all over the place.
From cameras that make sure faces are in focus before you take a picture, to Facebook tagging
people automatically when you upload one (you used to do that manually, remember?).
Some shows like CSI even use it to identify “bad guys” from security footage (ENHANCE! Then
insert crime pun), and you can unlock your phone just by looking at it!
Attractive people… That's why I'm taking the picture
In short, this is how Face Detection and Face Recognition work when unlocking your phone:
You look at your phone, and it extracts your face from an image (the nerdy name for this process is
face detection). Then, it compares the current face with the one it saved before during training and
checks if they both match (its nerdy name is face recognition) and, if they do, it unlocks itself.
As you can see, this technology not only allows me to be any type of dog I want. People are getting
pretty interested in it because of its ample applications: ATMs with face detection and facial
recognition software have been introduced for withdrawing money, and Emotion Analysis is
gaining relevance for research purposes.
An ATM with a facial recognition system. Source: www.bloomberg.com
So if you’re thinking:
“I wish I could build my own ATM with facial detection or facial recognition.”
Then stop dreaming: you’ll have a working face detector by the end of this article (banking
system not included).
Both of these classifiers process images in grayscale, basically because we don't need color
information to decide whether a picture has a face or not (we'll talk more about this later on). As
they come pre-trained in OpenCV, their learned knowledge files are also bundled with OpenCV, in
the opencv/data/ folder.
To run a classifier, we need to load its knowledge file first; without it, the classifier would know
nothing, just like a newborn baby (stupid babies).
Each file starts with the name of the classifier it belongs to. For example, a Haar cascade classifier
starts off as haarcascade_frontalface_alt.xml.
These are the two types of classifiers we will be using to analyze Casper.
Now, windows of all possible sizes are placed at all possible locations of each image, and plenty
of features are calculated for each one.
For example, in the image above, we are extracting two features. The first one focuses on the
property that the region of the eyes is often darker than the area of the nose and cheeks. The second
relies on the property that the eyes are darker than the bridge of the nose.
But among all these calculated features, most are irrelevant. For example, when used on the cheek,
the windows become irrelevant because none of those areas is darker or lighter than any other
region on the cheek: all sectors there are the same.
So we promptly discard the irrelevant features and keep only the relevant ones with a fancy
technique called AdaBoost. AdaBoost is a training process for face detection that selects only those
features known to improve the classification (face/non-face) accuracy of our classifier.
In the end, the algorithm exploits the fact that, generally, most of the region in an image is
non-face region. Considering this, it’s a better idea to have a simple method to check whether a
window is a non-face region and, if it is, discard it right away and never process it again. That way,
we can focus mostly on the areas where a face might be.
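To make these rectangular features concrete, here's a toy sketch of how a single two-rectangle feature could be computed with an integral image (the coordinates and regions are invented for illustration, and this is not how OpenCV exposes its internals):

import cv2

def region_sum(ii, x, y, w, h):
    #sum of all pixel values inside the rectangle (x, y, w, h), in constant time
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

gray = cv2.imread('data/test1.jpg', cv2.IMREAD_GRAYSCALE)
#cv2.integral returns an integral image one pixel larger in each dimension
ii = cv2.integral(gray)
#hypothetical eye region (expected darker) sitting right above a cheek region (expected lighter)
eyes = region_sum(ii, 60, 50, 80, 20)
cheeks = region_sum(ii, 60, 70, 80, 20)
#the feature value is the difference of the two sums
feature = cheeks - eyes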
The LBP classifier takes a different route: LBP features are extracted to form a feature vector that classifies a face from a non-face.
“But how are LBP features found?”
Each training image is divided into some blocks as shown in the picture below.
For each block, LBP looks at 9 pixels (a 3×3 window) at a time, with particular interest in the
pixel located at the center of the window.
Then, it compares the central pixel’s value with the value of every neighboring pixel in the 3×3
window. Each neighbor whose value is greater than or equal to the center pixel’s is set to 1, and
the others are set to 0.
After that, it reads the updated pixel values (which can be either 0 or 1) in a clockwise order and
forms a binary number. Next, it converts the binary number into a decimal number, and that
decimal number is the new value of the center pixel. We do this for every pixel in a block.
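As a toy sketch, here's what that computation could look like for one 3×3 window (the bit ordering, clockwise from the top-left neighbor, is one common convention; implementations vary):

import numpy as np

def lbp_value(img, y, x):
    #the pixel at the center of the 3x3 window
    center = img[y, x]
    #its 8 neighbors, read clockwise starting from the top-left corner
    neighbors = [img[y - 1, x - 1], img[y - 1, x], img[y - 1, x + 1],
                 img[y, x + 1], img[y + 1, x + 1], img[y + 1, x],
                 img[y + 1, x - 1], img[y, x - 1]]
    #neighbors >= center become 1, the rest 0; the bits form a binary number
    bits = ''.join('1' if n >= center else '0' for n in neighbors)
    #that binary number, read as a decimal, is the new value of the center pixel
    return int(bits, 2)

window = np.array([[90, 80, 70],
                   [85, 60, 75],
                   [95, 65, 88]])
print(lbp_value(window, 1, 1))   #prints 255: every neighbor is >= the center value 60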
LBP conversion to binary. Source: López & Ruiz; Local Binary Patterns applied to Face Detection
and Recognition.
Then, it turns the values of each block into a histogram, so we end up with one histogram for each
block in the image, like this:
LBP Histogram. Source: López & Ruiz; Local Binary Patterns applied to Face Detection and
Recognition.
Finally, it concatenates these block histograms to form a single feature vector for the image, which
contains all the features we are interested in. This is how we extract LBP features from a picture.
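Continuing the sketch, the block histograms and final concatenation could look like this, using the lbp_value helper from above (the 8×8 grid and 256 bins are assumptions; real implementations tune both):

def lbp_feature_vector(gray, grid=(8, 8)):
    h, w = gray.shape
    #compute the LBP value for every interior pixel
    lbp_img = np.zeros((h, w), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lbp_img[y, x] = lbp_value(gray, y, x)
    #split the LBP image into blocks and build one 256-bin histogram per block
    bh, bw = h // grid[0], w // grid[1]
    histograms = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            block = lbp_img[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            histograms.append(hist)
    #concatenating the block histograms gives one feature vector for the whole image
    return np.concatenate(histograms)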
Unfortunately, “You need balls and please stop calling,” doesn’t work for us.
So, when more accurate detection is required, the Haar classifier is the way to go. This bad boy is
more suitable for technology such as security systems or high-end stalking.
The LBP classifier, on the other hand, is faster, and therefore a better fit for mobile applications or
embedded systems.
3.1. Dependencies
Let's first install the required dependencies to run this code.
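Assuming you use pip, something like this should do it:

pip install opencv-python matplotlib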
Note: If you don't want to install matplotlib, then replace matplotlib code with OpenCV code as
shown below:
Instead of:
plt.imshow(gray_img, cmap='gray')
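use OpenCV's own display functions instead (the window title here is arbitrary):

#display the image in an OpenCV window and wait for a key press to close it
cv2.imshow('Test Image', gray_img)
cv2.waitKey(0)
cv2.destroyAllWindows()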
Please keep these functions in mind, as I will use them in the following code.
When you load an image using OpenCV, it loads it into BGR color space by default. To show the
colored image using matplotlib we have to convert it to RGB space. The following is a helper
function to do exactly that:
def convertToRGB(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.cvtColor is an OpenCV function that converts images between color spaces. It takes as input an
image to transform and a color-space code (like cv2.COLOR_BGR2RGB), and returns the
processed image.
Now that we are all set up, let's start coding our first face detector: Haar.
For this, we’ll keep within arm’s reach the very handy function cv2.cvtColor (to convert images
to grayscale).
This step is necessary because many operations in OpenCV are done in grayscale for performance
reasons.
To display our image, I’ll use the plt.imshow(img, cmap) function of matplotlib.
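Putting those together, loading and displaying a grayscale test image could look like this (the file path is an assumption; point it at your own picture):

import cv2
import matplotlib.pyplot as plt

#load the test image; OpenCV loads it in BGR color space by default
test1 = cv2.imread('data/test1.jpg')
#convert it to grayscale, since the classifiers work on grayscale images
gray_img = cv2.cvtColor(test1, cv2.COLOR_BGR2GRAY)
#display the grayscale image with matplotlib
plt.imshow(gray_img, cmap='gray')
plt.show()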
Before we can continue with face detection, we have to load our Haar cascade classifier.
OpenCV provides us with the class cv2.CascadeClassifier, which takes as input the training file of
the (Haar/LBP) classifier we want to load and loads it for us. Easy-breezy-Covergirl.
Since we want to load our favorite (for now), the Haar classifier: its XML training files are stored
in the opencv/data/haarcascades/ folder. You can also find them in the data folder of the GitHub
repo I’ll share with you at the end of this article.
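Loading it looks like this (the path is an assumption; adjust it to wherever your XML file lives):

#load the Haar cascade training file for frontal faces
haar_face_cascade = cv2.CascadeClassifier('data/haarcascade_frontalface_alt.xml')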
Now, how do we detect a face from an image using the CascadeClassifier we just loaded?
Well, again, OpenCV's CascadeClassifier has made it simple for us, as it comes with the function
detectMultiScale, which does exactly that. Its main arguments are:
- image: the grayscale image in which to search for faces.
- scaleFactor: specifies how much the image size is reduced at each image scale; it compensates for faces appearing bigger simply because they are closer to the camera.
- minNeighbors: specifies how many neighboring detections a candidate rectangle needs in order to be kept; higher values filter out false positives.
There are other parameters as well, and you can review the full details of this function here.
These parameters need to be tuned according to your own data.
Now that we know a straightforward way to detect faces, let's find the face in our test image.
The following code tries to detect faces in the image and, if it succeeds, prints the number of faces
it found, which in our case should be 1. Only 1, since no other spiritual being is out there. Right?…
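A sketch of that code (the scaleFactor and minNeighbors values are starting points, not gospel):

#detect faces in the grayscale image; returns a list of rectangles
faces = haar_face_cascade.detectMultiScale(gray_img, scaleFactor=1.1, minNeighbors=5)
#print the number of faces found
print('Faces found:', len(faces))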
Faces found: 1
Woohoo! We found our face! And it’s only one beautiful face at a time…
Next, let's loop over the list of faces (rectangles) it returned and draw those rectangles, using yet
another built-in OpenCV function, rectangle, on our original colored image to see if it found the
right faces.
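Something like this (the green color and line thickness are arbitrary choices):

#go over the list of faces and draw a rectangle around each one on the original colored image
for (x, y, w, h) in faces:
    cv2.rectangle(test1, (x, y), (x + w, y + h), (0, 255, 0), 2)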
Let’s display the original image to see the rectangles we just drew and verify that detected faces are
real ones and not any false positives.
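Using our convertToRGB helper from before:

#convert from BGR to RGB and show the image with the rectangles we drew
plt.imshow(convertToRGB(test1))
plt.show()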
Unfortunately, this code is more scattered than my attention span during high school. Let’s turn this
into a function that is completely reusable.
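Here's that function, reconstructed as a sketch from the walkthrough that follows (the scaleFactor default is an assumption):

def detect_faces(f_cascade, colored_img, scaleFactor=1.1):
    #work on a copy so all operations leave the original image untouched
    img_copy = colored_img.copy()
    #convert the copy to grayscale, as our face detector expects a grayscale image
    gray = cv2.cvtColor(img_copy, cv2.COLOR_BGR2GRAY)
    #detect faces; this returns a list of rectangles Rect(x, y, w, h)
    faces = f_cascade.detectMultiScale(gray, scaleFactor=scaleFactor, minNeighbors=5)
    #draw a rectangle around each detected face on the copy
    for (x, y, w, h) in faces:
        cv2.rectangle(img_copy, (x, y), (x + w, y + h), (0, 255, 0), 2)
    #return the modified copy of the picture
    return img_copy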
Inside the function, first I made a copy img_copy of the passed image. This way we'll do all of the
operations on a copy and not the original image. Then, I converted the copied image img_copy to
grayscale, as our face detector expects a grayscale image.
After that, I called the detectMultiScale function of our CascadeClassifier to return the list of
detected faces (which is the list of rectangles Rect(x, y, w, h)).
Once we have the list of detected faces, I loop over them and draw a rectangle on the copy of
the image. In the end, the function returns the modified copy of the picture.
So, that code is pretty much the same as before; it’s just grouped inside a function for reusability.
Before I start packing everything I own to move exactly-anywhere-but-here, let me check this
once again. What could’ve gone wrong?
Oh, OK: some faces may be closer to the camera, so they appear bigger than the faces in the back.
This was scaring the bejeezus out of me.
A simple tweak to the scale factor compensates for it, so we can move that parameter around. For
example, scaleFactor=1.2 improved the results.
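In code, that's just passing a different value to our function:

#run the detector again with a larger scale factor to handle faces closer to the camera
fixed_img = detect_faces(haar_face_cascade, test1, scaleFactor=1.2)
plt.imshow(convertToRGB(fixed_img))
plt.show()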
Please, remember to tune these parameters according to the information you have about your data.
From a coding perspective, you don't have to change anything in our face detection code except
that, instead of loading the Haar classifier training file, you load the LBP one; the rest of the
system stays the same.
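A sketch of that swap (the paths are assumptions; lbpcascade_frontalface.xml ships in OpenCV's opencv/data/lbpcascades/ folder):

#load the LBP cascade training file instead of the Haar one
lbp_face_cascade = cv2.CascadeClassifier('data/lbpcascade_frontalface.xml')
#read a test image and call the same detect_faces function as before
test_img = cv2.imread('data/test1.jpg')
faces_detected_img = detect_faces(lbp_face_cascade, test_img)
#show the result
plt.imshow(convertToRGB(faces_detected_img))
plt.show()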
You can also try as many detectors (eye detector, smile detector, etc.) as you want, without
changing much of the code (leave a comment if you’d like to build one of those… Ghosts: no
comments allowed!)
Who's this handsome fellow? I'm 97% sure he didn't write this article as a pick-up method…
As you can see, the code is exactly the same; I just loaded up our CascadeClassifier with the LBP
training file this time. I read a test image and called our detect_faces function, which returned an
image with a face drawn on it.
It's a treble!
No big deal! I’ll run both Haar and LBP on two test images to compare the accuracy and the time
each one takes.
I loaded both Haar and LBP classifiers and two test images: test1 and test2.
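In code (the file paths are assumptions, as before):

#load both cascade classifiers
haar_face_cascade = cv2.CascadeClassifier('data/haarcascade_frontalface_alt.xml')
lbp_face_cascade = cv2.CascadeClassifier('data/lbpcascade_frontalface.xml')
#load the two test images
test1 = cv2.imread('data/test1.jpg')
test2 = cv2.imread('data/test2.jpg')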
4.1. Test 1
I’ll try both classifiers on test1 image:
#------------HAAR-----------
import time
#note time before detection
t1 = time.time()
#call our function to detect faces
haar_detected_img = detect_faces(haar_face_cascade, test1)
#note time after detection
t2 = time.time()
#calculate time difference
dt1 = t2 - t1
#print the time difference
print('Haar detection time: ' + str(round(dt1, 3)) + ' secs')
I used the Python library function time.time() to keep track of time. Before starting to find faces
in our test image, I note the start time t1, and then I call our function detect_faces. After it returns,
I note the end time t2. The difference between start time t1 and end time t2, dt1, is what we’re
interested in, as it’s the time our face detector took to detect faces.
#------------LBP-----------
#note time before detection
t1 = time.time()
#call our function to detect faces
lbp_detected_img = detect_faces(lbp_face_cascade, test1)
#note time after detection
t2 = time.time()
#calculate time difference
dt2 = t2 - t1
#print the time difference
print('LBP detection time: ' + str(round(dt2, 3)) + ' secs')
Now that we have the face detected images and measured the time both of our face detectors took,
let's see how they stack up against each other!
I am going to use the matplotlib function subplots(rows, cols, figsize) to display the results side by
side, with each detector's time difference as the title of its subplot (image window).
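A sketch of that display code:

#create a figure with two subplots side by side
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
#show each detector's result with its detection time as the subplot title
ax1.set_title('Haar detection time: ' + str(round(dt1, 3)) + ' secs')
ax1.imshow(convertToRGB(haar_detected_img))
ax2.set_title('LBP detection time: ' + str(round(dt2, 3)) + ' secs')
ax2.imshow(convertToRGB(lbp_detected_img))
plt.show()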
4.2. Test 2
Let's see the results for test2 image. The code is the same as for test1:
#------------HAAR-----------
#note time before detection
t1 = time.time()
#call our function to detect faces
haar_detected_img = detect_faces(haar_face_cascade, test2)
#note time after detection
t2 = time.time()
#calculate time difference
dt1 = t2 - t1
#print the time difference
print('Haar detection time: ' + str(round(dt1, 3)) + ' secs')
#------------LBP-----------
#note time before detection
t1 = time.time()
#call our function to detect faces
lbp_detected_img = detect_faces(lbp_face_cascade, test2)
#note time after detection
t2 = time.time()
#calculate time difference
dt2 = t2 - t1
#print the time difference
print('LBP detection time: ' + str(round(dt2, 3)) + ' secs')
Almost ready…
Actually: kinda ready to figure out who, or what, has been trying to get followers through my pics.
I’m halfway there, so don’t miss out on how this crazy adventure unfolds.
Little miss-creep-a-lot had better think again before she (or he, it’s 2017) pops into any of my
selfies ever again.
You can download the complete code we used in this face detection tutorial from this repo along
with test images and LBP and Haar training files.
Let’s all thank OpenCV for implementing the above-mentioned algorithms and making our lives so
much easier.