
Received: 21 September 2023 Revised: 3 June 2024 Accepted: 12 July 2024

DOI: 10.4218/etrij.2023-0395

ORIGINAL ARTICLE

Enhancement of eye socket recognition performance using inverse histogram fusion images and the Gabor transform

Harisu Abdullahi Shehu¹ | Ibrahim Furkan Ince² | Faruk Bulut³,⁴

¹School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
²Department of Software Engineering, Istinye University, Istanbul, Turkey
³Department of Computer Engineering, Istanbul Esenyurt University, Istanbul, Turkey
⁴School of Computer Science and Electronic Engineering, University of Essex, Colchester, England

Correspondence
Harisu Abdullahi Shehu, School of Engineering and Computer Science, Victoria University of Wellington. Email: harisushehu@ecs.vuw.ac.nz

Funding information
None reported.

Abstract
The eye socket is a cavity in the skull that encloses the eyeball and its surrounding muscles. It has unique shapes in individuals. This study proposes a new recognition method that relies on the eye socket shape and region. This method involves the utilization of an inverse histogram fusion image to generate Gabor features from the identified eye socket regions. These Gabor features are subsequently transformed into Gabor images and employed for recognition by utilizing both traditional methods and deep-learning models. Four distinct benchmark datasets (Flickr30, BioID, Masked AT&T, and CK+) were used to evaluate the method's performance. These datasets encompass a range of perspectives, including variations in eye shape, covering, and angles. Experimental results and comparative studies indicate that the proposed method achieved a significantly (p < 0.001) higher accuracy (average value greater than 92.18%) than that of the relevant identity recognition method and state-of-the-art deep networks (average value less than 78%). We conclude that this improved generalization has significant implications for advancing the methodologies employed for identity recognition.

KEYWORDS
classification, deep learning, eye socket, Gabor features, identity recognition, image matching, vector quantization

This is an Open Access article distributed under the terms of the Korea Open Government License (KOGL) Type 4: Source Indication + Commercial Use Prohibition + Change Prohibition (http://www.kogl.or.kr/info/licenseTypeEn.do). 1225-6463/$ © 2024 ETRI. ETRI Journal. 2025;47(1):123–133. wileyonlinelibrary.com/journal/etrij

1 | INTRODUCTION

Biometric identification methods are becoming increasingly popular in various applications and allow the recognition of individuals based on their distinct physical characteristics. These biometrics encompass a broad range of features, including voice, ear, palmprint, fingerprint, face, iris, retina, hand geometry, posture, and walking style. Iris detection has gained significant attention because of its unique and unchanging nature throughout a person's lifetime. However, capturing high-resolution iris images in practical settings is challenging. Obtaining close-up images is imperative for precise detection because the intricate patterns within the iris contain crucial details. The overall performance of an iris detection system relies on accurate localization and segmentation of the iris from an eye image as well as on the resolution of the image itself.

Eye socket recognition, also referred to as eye region detection or eye localization, is a computer vision technique used to identify and determine the position and shape of an eye region within an image or video frame [1].


This technique has been applied in various domains, including gaze tracking [2], face recognition [3], emotion detection [4], eye state recognition [5], gaze tracking [6], and exigency detection [7].

The eye region is an intricate structure that encompasses the eye itself, the eyebrows, and the surrounding area, known as the eye socket. The eye socket region consists of elements such as the eyelashes, eyelids, and sclera, as illustrated in Figure 1. Specifically, eye socket recognition entails the detection of the position and size of the eye socket in an image. This information is subsequently used to identify and analyze the eye regions accurately.

FIGURE 1 Basic parts of an eye and the full eye socket.

To detect the eye socket accurately, the initial step involves the segmentation of the eye image into distinct parts to extract the annular region located between the sclera and pupil. Failure to identify these regions precisely can lead to inaccurate outcomes in the identification process. Therefore, before proceeding with eye socket matching, it is crucial to effectively localize and segment the eye socket regions, including the inner boundary (pupil) and the outer boundary (sclera along with the eyelids and eyelashes).

Eye socket recognition can be accomplished using various techniques, including feature-based methods and deep-learning algorithms [8]. Feature-based methods involve the extraction of specific attributes or patterns, such as color, texture, and shape, from images that are known to be associated with the eye region. These extracted features are then utilized to train a machine-learning model to recognize the eye region in new images. Conversely, deep-learning algorithms employ convolutional neural networks (CNNs) for eye socket recognition. CNNs are specialized artificial neural networks designed to process images by extracting relevant features and patterns. To train a CNN for eye socket recognition, a sizable dataset of labeled images is utilized, wherein each image is annotated according to the location and size of the eye region. Once the CNN is trained, it can accurately predict the location and size of the eye region in new images.

Various types of biometric studies have been conducted to explore gaze, iris, pupil, and emotional detection, among other areas of interest [9]. Several studies have been conducted on this topic. For example, Min-Allah and others [1] conducted a comprehensive investigation of recent research on pupil and iris recognition. Although these methods have achieved state-of-the-art performance, empirical findings indicate that obstacles, such as contact lenses, closed eyes, eye diseases, light reflections, and moments of pupil constriction or dilation, can hamper their effectiveness [10]. Similarly, Mahanama and others [11] suggested that many studies rely on trend analysis or first-order statistical features, such as the minimum, maximum, and skewness, rather than employing advanced measures, such as the eye region. These features, which are based on color, are expected to exhibit decreased performance and potential racial bias when identifying individuals of races different from those used to train the model. Moreover, the color of one's eyes does not serve as the sole basis for recognizing one's identity.

Conversely, a real-time vision-based system for eye state recognition was proposed using a dual CNN ensemble over eye patches [12]. Quality-based multimodal eye recognition systems that utilize the entire eye, scleral areas, and iris for quality measurements within the recognition system have been presented [13]. These multimodal systems enhance the accuracy of biometric recognition systems.

Many studies, including those mentioned above, have primarily focused on the full facial region, where the face is detected before extracting the eye region. However, the use of partial facial coverings, such as face masks, has added to the existing challenges of facial identification, especially given the recent impact of the coronavirus [14]. For example, the impacts of face masks on facial identity, gender, age, and emotion recognition were assessed using a single dataset [15]. Carragher and others examined facial recognition and perceptual face-matching systems and compared the performances of human observers and deep neural networks (DNNs) on faces with surgical face masks [16]. Shehu and others [17] and Noyes and others [18] compared familiar and unfamiliar face matching and the classification of emotions of individuals wearing face masks and sunglasses. These studies predominantly explored the effects of partial face coverage on human recognition abilities and automated methods.

By contrast, a multimodal methodology for facial expression recognition with face masks was proposed utilizing deep-learning techniques [19]. This methodology addresses the critical step in face reading when the lower portion of the face is covered. Freud and others examined how the use of face masks alters face perception, providing qualitative and quantitative evidence of changes in masked face processing that can have meaningful effects on daily life and social interactions [20]. Furthermore, several studies employing deep-learning-based systems in the Internet-of-Things [21], ensemble deep-transfer learning [22], and YOLO models in public places for real-time detection systems [23] have been conducted. These studies highlighted the decrease in accuracy observed in automated recognition systems when the face is partially covered, such as with face masks. However, no attempts have been made to improve the accuracy of automated recognition systems.

Conventional recognition systems often rely on detailed and high-resolution images, particularly for iris recognition, which requires capturing intricate patterns within the iris. However, this approach has practical limitations in terms of consistently capturing high-resolution images, particularly from a distance or in less-controlled environments. Inspired by an iris recognition system that employs uniform histogram fusion images and deformable circular hollow kernels [24], this paper presents a novel recognition system that utilizes the eye socket region. This method involves the generation of Gabor features from the identified eye socket regions using an inverse histogram fusion image (IHFI) and the conversion of these features into Gabor images for recognition, with or without the integration of machine-learning modules. The rationale behind using this specific image type stems from the challenge of differentiating the numerous unique patterns distributed throughout the iris, which can be difficult to distinguish effectively. The IHFI modifies the image histogram to emphasize the most frequent features within the eye socket region, countering the issue of skewed histograms owing to predominant skin colors. This enhances the visibility and distinctiveness of the features necessary for accurate recognition.

The primary objective of this study was to propose a robust and innovative eye socket recognition method that focuses on the shape and surrounding regions of the eye socket. Unlike traditional iris-based systems that rely solely on the iris region, the proposed method provides a more dependable means of identification by considering the entire eye socket area. By leveraging the distinctive shape and surroundings of the eye socket, this method has the potential to offer a more universal approach, applicable to individuals with diverse physical characteristics.

The contributions of this study are the following:

1. Propose a novel eye socket recognition method that employs inverse histogram fusion within the Gabor transform. This method, which considers the unique shape of the eye socket and its surrounding region, is expected to provide advantages over existing iris-based recognition systems. Specifically, it aims to improve accuracy in identifying individuals with obscured or damaged irises.
2. Analyze the generalizability of the proposed method using four different in-the-wild facial recognition datasets: (i) the BioID [25] dataset with uncovered faces, (ii) the MaskedAT&T [26] dataset with faces partially covered by face masks, (iii) the Flickr30 [27] dataset with variations such as makeup, glasses, or eye infections, and (iv) the CK+ [28] dataset with individuals exhibiting different facial expressions.
3. Highlight the advantages of the proposed method over the Gabor method and several state-of-the-art deep-learning approaches, including ResNet50 [29], InceptionV3 [30], VGG19 [31], the face mesh deep neural network (FaceMesh_DNN) [32], and the multitask network (MTN) [33], across datasets of varying difficulty. These findings underscore the efficacy and robustness of the proposed approach in diverse and challenging scenarios.

The remainder of this paper is organized as follows. Section 2 describes the proposed method in detail, and Section 3 provides comprehensive information on the experimental work, including hardware specifications, utilized datasets, and the obtained results. Section 4 presents a thorough discussion of the findings and potential avenues for future research. Finally, Section 5 concludes the paper.

2 | PROPOSED METHOD

Before introducing the proposed method, it is essential to address the potential limitations of eye and eye socket recognition systems. These considerations are crucial for a comprehensive understanding of the proposed method, as illustrated in Figure 2. The system operates through three distinct phases: region-of-interest (ROI) detection, IHFI, and recognition.
FIGURE 2 Flowchart of the proposed method.

2.1 | Limitations

Several barriers affect the effectiveness of detection systems. Significant limitations that can affect the performance of an eye socket recognition system include (i) makeup, (ii) glasses and varying light conditions, (iii) illnesses, (iv) emotions, and (v) different angles. Figure 3 presents a sample showing how the eye socket can be influenced by these factors in different individuals before and after applying eye makeup.

FIGURE 3 Comparison of images before and after eye makeup: (A) before and (B) after [34].

The use of eyeglasses, the presence of light, and eye illnesses can also serve as restrictions that affect the eye socket (see Figure 4).

FIGURE 4 (A) Glasses reflecting specular light, (B) colorimetric glasses, (C, D) eyeglass frames preventing the eye socket from being seen, (E) hair partially obstructing the eye, (F) a partially open eye and eye blinking, (G) closed eyes, (H) a nearly closed eye, and (I, J) eye illnesses.

Different views of the emotions of the same person are shown in Figure 5. Each prominent emotional state directly affects the shape of the eye socket.

FIGURE 5 Photographs of a person's face expressing different emotions: (A) kiss, (B) crying, (C) sad, (D) winking, (E) astonished, (F) disappointed, (G) tongue, (H) angry, (I) grinning, and (J) smiling [35].

Views of a person from different angles are shown in Figure 6. In these cases, the recognition system fails.

FIGURE 6 Images acquired from different angles of the same person: (A) down, (B) up, (C) 45° right, (D) 45° left, and (E) front.

In the experimental studies, a woman could cover her face with her hands in some photographs; similarly, individuals could bow their heads. Individuals wearing sunglasses were excluded, so this issue was not addressed: if the eyes were not visible, the images were excluded from the dataset. Contact lenses do not affect the proposed system because they are placed on the iris.

2.2 | Eye socket extraction

The key aspect of a system that recognizes human identity from the eye region is the accurate identification of the eye sockets. This can be achieved by using Haar-like object detectors, originally introduced by Viola [36] and subsequently improved by Lienhart [37]. These detectors enable a classifier trained with sample views of an object to detect that object in an entire image. This method offers speed, efficiency, and accuracy when implemented appropriately. It is particularly effective for detecting objects that are partially obscured or when video frames are noisy. The OpenCV library provides fully trained eye-region Haar-cascade descriptors, which were utilized in our system, as sketched below.
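As a concrete illustration of this detection step, the short Python sketch below runs OpenCV's pretrained eye cascade over a grayscale image. The specific cascade file (haarcascade_eye.xml) and the detectMultiScale parameters are illustrative assumptions; the paper states only that OpenCV's pretrained eye-region Haar-cascade descriptors were used.

```python
import cv2

# Pretrained eye cascade shipped with OpenCV; the file name and the
# detection parameters below are illustrative choices, not values
# reported in the paper.
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def extract_eye_sockets(image_bgr):
    """Detect candidate eye-socket regions and return them as crops."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Each detection is a bounding box (x, y, w, h) around an eye region;
    # scaleFactor and minNeighbors trade off recall against false positives.
    boxes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [gray[y:y + h, x:x + w] for (x, y, w, h) in boxes]
```

The returned crops would then be passed to the IHFI conversion described in the next subsection.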
2.3 | IHFI

This study proposes a new conversion method known as the IHFI. This approach is based on the fact that the iris contains numerous unique patterns, but the surrounding eye socket region is dominated by skin color, leading to a skewed histogram. To address this issue, an eye socket image is first inverted in terms of color, and a probability density function (PDF) is used to measure the frequency distribution of the image. An inverse PDF (which maps the intensity values back to the intensity levels) is then created from the intensity histogram of each color channel (red, green, blue, and alpha) of the eye socket image, as shown in (1). The resulting IHFI captures the most probable and frequent features within the eye socket region, which are the iris patterns. This IHFI is then used to generate Gabor features, which are compared with a dataset using normalized cross-correlation for identity recognition.

$$p_x(i) = \frac{n_{L-1-i}}{n}, \quad 0 \le i \le L-1, \qquad (1)$$

where $p_x(i)$ represents the probability of the pixel intensity $i$, $n$ denotes the total number of pixel intensities, $L$ is the maximum pixel intensity value, and $i$ is the pixel intensity value, where $0 \le i \le L-1$.

Equation (2) shows the calculation using the PDF value $p_x(i)$, which is related to the occurrence probability of an intensity level for each color channel of the eye socket image. To create a probability image for each channel, the PDF value of each pixel is multiplied by its intensity value. This process is performed separately for each pixel in the image for each color channel (red, green, blue, and alpha):

$$j = p_x(i) \cdot i, \qquad (2)$$

where $j$ represents the resulting value, $p_x(i)$ represents the probability of the pixel intensity $i$, and $i$ is the pixel intensity value.

Consider the case where there is only one white-colored pixel in the eye socket image: the resulting PDF value for the white color would be nearly zero, which would create an intensity value near zero in the probability image created from it. This scenario represents the maximum information loss, where the white color almost turns black. To prevent this, the geometric mean of the original pixel intensity value and the pixel intensity value of the conventional probability image is calculated. This method aims to avoid losing information in worst-case scenarios. The corresponding equation (3) is given as follows:

$$k = \sqrt{j \cdot i}, \qquad (3)$$

where $k$ represents the geometric mean, $j$ denotes the resulting value, and $i$ denotes the original pixel intensity value.

The geometric mean operation is performed to adjust the pixel intensity values in relation to their likelihood of occurrence in the eye socket image. This procedure is effective in reducing the information loss that may occur during the conversion process. The geometric mean ensures that the pixel intensity values are properly calibrated according to their probabilistic distribution in the image. Equations (4) and (5) are the expanded versions of (3):

$$k = \sqrt{p_x(i) \cdot i \cdot i}, \qquad (4)$$

$$k = i\sqrt{p_x(i)}, \qquad (5)$$

where $k$ is the final histogram image created from the square root of the PDF of the eye socket. The square-root operation is expected to produce better results in classifying natural images, such as eye socket images, compared with conventional probability methods because it allows for natural growth. However, the resulting histogram may not be uniform and may be difficult for the human eye to perceive. This makes it unsuitable for classification tasks in which uniformity is important. To address this issue, a standard histogram equalization operation is applied separately to each channel of the histogram image to achieve uniformity. Figure 7A,B presents the frequencies of the histograms extracted from the same image (presented in Figure 8) before and after histogram equalization. The histogram equalization operation employs the cumulative density function (CDF), which is calculated from the PDF value of each pixel in the histogram image $k$, as expressed by (6):

$$\mathrm{cdf}_x(k) = \sum_{l=0}^{k} p_x(l), \qquad (6)$$

where $\mathrm{cdf}_x(k)$ represents the cumulative distribution function of the pixel intensity up to $k$ and $p_x(l)$ represents the probability of pixel intensity $l$; the summation runs from $l = 0$ to $k$.

It is important to note that standard histogram equalization methods may lead to over- or under-enhancement and may produce visual artifacts. By contrast, adaptive histogram equalization methods can achieve better results.
In this study, a newly introduced method called low-dynamic-range histogram equalization was used for contrast enhancement because it is expected to provide better results. However, because the main goal of this study was to present a robust approach for iris recognition, additional studies can be performed to explore the possibility of implementing fine-tuning procedures, such as using adaptive histogram equalization methods. After obtaining the CDF vector, which consists of 256 bins in the range [0, 255], an inverse transformation was employed using a particular transformation function $y = T(x)$, where $T$ is the number of normalized intensity levels and $y$ is the generated uniform histogram image (see Figure 9). Subsequently, the CDF was linearized across the value range, that is, $\mathrm{cdf}_y(l) = l \cdot K$ for some constant $K$. Finally, a linear mapping operation was performed to map the values back into their original range using (7):

$$y' = y \cdot (\max(x) - \min(x)) + \min(x), \qquad (7)$$

where $y'$ represents the mapped value, $y$ denotes the original value, $\max(x)$ is the maximum value in the original range of $x$, and $\min(x)$ is the minimum value in the original range of $x$.

The steps between (1) and (7) were performed for each image channel separately, and four different single-channel uniform histogram images (for the red, green, blue, and alpha channels) were created. The generated images were merged into a single final image to create a four-channel RGBA output, as sketched below.

FIGURE 7 Histograms of the image presented in Figure 8: (A) before and (B) after histogram equalization.

FIGURE 9 Extraction of the inverse histogram fusion images (IHFI) inside the eye socket boundaries of two people (from left to right). The eye socket images are circled in the black region.
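To make the conversion concrete, the following NumPy sketch implements one plausible reading of eqs. (1)-(7) for a single 8-bit channel and then merges the channels into an RGBA output. The intermediate 8-bit rounding of $k$, the use of standard (rather than low-dynamic-range) histogram equalization, and taking the $k$ image's own min/max in the final linear mapping are our assumptions, not details specified in the paper.

```python
import numpy as np

def ihfi_channel(ch):
    """One 8-bit channel through the IHFI steps, as we read eqs. (1)-(7)."""
    L, n = 256, ch.size
    hist = np.bincount(ch.ravel(), minlength=L)
    p = hist[::-1] / n                 # inverse PDF: p_x(i) = n_{L-1-i} / n   (1)
    k_lut = np.arange(L) * np.sqrt(p)  # k = i * sqrt(p_x(i))            (2)-(5)
    k = k_lut[ch]                      # per-pixel geometric-mean image
    # Standard histogram equalization via the CDF (6)...
    k8 = np.round(k).astype(np.uint8)
    cdf = np.cumsum(np.bincount(k8.ravel(), minlength=L)) / n
    y = cdf[k8]                        # uniform values in [0, 1]
    # ...followed by the linear mapping back to the original range (7).
    return (y * (k.max() - k.min()) + k.min()).astype(np.uint8)

def ihfi(rgba):
    """Process each channel independently and merge into an RGBA output."""
    return np.dstack([ihfi_channel(rgba[..., c]) for c in range(rgba.shape[-1])])
```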
2.4 | Gabor filter

Gabor filters are a type of linear filter commonly used in image processing and computer vision applications. They are named after Dennis Gabor, a Hungarian physicist who won the Nobel Prize in Physics in 1971 for his work on holography. Gabor filters are banks of filters, each of which is tuned to a specific frequency and orientation. They are designed to mimic the receptive fields of simple cells in the visual cortex. By convolving an image with a bank of Gabor filters, features such as edges, textures, and corners can be extracted. Gabor filters are often used in applications such as face, fingerprint, and object recognition. They have the advantage of capturing both local and global features of an image, making them a powerful tool for feature extraction. However, they can be computationally expensive, especially when a large number of filters is used. In (8) and (9), the two-dimensional (2D) Gabor filters are expressed as follows:

$$G_c[j,k] = B \, e^{-\frac{j^2 + k^2}{2\sigma^2}} \cos\!\big(2\pi f (j \cos\theta + k \sin\theta)\big), \qquad (8)$$

where $G_c[j,k]$ represents the real part of the Gabor filter response at position $(j,k)$, $B$ is a constant, $e$ is the base of the natural logarithm, $\sigma$ is the standard deviation of the Gaussian envelope, $f$ is the spatial frequency of the sinusoidal plane wave, and $\theta$ is the orientation of the sinusoidal plane wave.

$$G_s[j,k] = C \, e^{-\frac{j^2 + k^2}{2\sigma^2}} \sin\!\big(2\pi f (j \cos\theta + k \sin\theta)\big), \qquad (9)$$

where $G_s[j,k]$ represents the imaginary part of the Gabor filter response at position $(j,k)$ and $C$ is a constant.

The use of Gabor filters in eye socket recognition provides a unique description of texture compared with other filtering algorithms, such as bilateral filtering, guided filtering, and wavelet-based filtering. Other feature extraction methods may extract features, but they do not provide a unique description, unlike Gabor features in the Gabor space, which can be utilized for eye socket pattern recognition. Once the eye socket region is segmented and the Gabor features are extracted, the subsequent step is to normalize these features to generate iris codes for comparison. Normalization is necessary because there are variations in eye socket location, size, and orientation among individuals, which require a common representation with similar dimensions. The normalized Gabor features of each eye socket produce a sample Gabor image, as shown in Figure 8.

FIGURE 8 Extraction of the Gabor feature images for the same people shown in Figure 6 (from left to right).

Finally, the obtained Gabor images of the eye socket were employed for eye socket recognition by comparing the test images with the training images using normalized cross-correlation, as sketched below. In addition, these images were used in deep learning.
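The sketch below pairs a small Gabor bank with normalized cross-correlation matching using OpenCV. The kernel size, the four orientations, the frequency, and the max-fusion of orientation responses are illustrative choices rather than parameters reported in the paper; note also that OpenCV's getGaborKernel is parameterized by the wavelength (1/f) rather than the frequency f used in eqs. (8) and (9).

```python
import cv2
import numpy as np

def gabor_image(ihfi_patch, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4),
                sigma=4.0, freq=0.1, ksize=31):
    """Fuse the responses of a small Gabor bank into one feature image."""
    patch = ihfi_patch.astype(np.float32)
    responses = []
    for theta in thetas:
        # OpenCV parameterizes the sinusoid by wavelength lambda = 1 / f.
        kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta,
                                    1.0 / freq, gamma=1.0, psi=0,
                                    ktype=cv2.CV_32F)
        responses.append(cv2.filter2D(patch, -1, kernel))
    fused = np.max(responses, axis=0)  # strongest response per pixel
    return cv2.normalize(fused, None, 0, 255,
                         cv2.NORM_MINMAX).astype(np.uint8)

def ncc_score(test_gabor, train_gabor):
    """Normalized cross-correlation between two same-size Gabor images;
    identity is assigned to the training image with the highest score."""
    result = cv2.matchTemplate(test_gabor, train_gabor, cv2.TM_CCORR_NORMED)
    return float(result.max())
```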
3 | EXPERIMENTAL WORK

3.1 | Hardware specification

A 24-GB graphical processing unit (Nvidia RTX A5000) with the CUDA Toolkit (v11.3) was used throughout the experiments.

3.2 | Datasets and experimental design

In this study, four popular benchmarking datasets were used, as listed in Table 1. All datasets were obtained from the wild (Figure 10). Unknown or unprovided information is represented by the symbol "-". Although the resolution was low, the recognition system performed well. Each image in the dataset had a different resolution. When the rectangular eye-pair part is extracted by ROI analysis, the area becomes almost too small and prohibits efficient recognition of the eye sockets.

The datasets were divided into training and test sets (80-20 split). Subsequently, the features produced using Gabor filtering and histogram equalization were fed to a DNN learner for classification. The model was set up to run for 21000 epochs, the images were resized to 28 × 28 pixels, and normalization was applied to enable fast computation (a sketch of this preprocessing step is given after Table 1).

TABLE 1 Datasets used for eye socket recognition.

Name       | No. of individuals | No. of images/individual | No. of images | Resolution           | Specifications
BioID      | 23                 | [2, 150]                 | 1526          | 384 × 286            | Frontal face images
MaskedAT&T | -                  | -                        | -             | -                    | Frontal face images with masks
Flickr30   | 30                 | -                        | -             | 500 × 500            | Frontal face images
CK+        | 123                | [10, 60]                 | 5876          | 640 × 480, 640 × 490 | Frontal face images with different emotional expressions

FIGURE 10 Sample images of an exemplar participant from the CK+ dataset. As shown, different emotions affect the shape of the eye region. Note that images from Flickr30, BioID, and Masked AT&T are not shared owing to copyright reasons.
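A minimal sketch of this experimental design follows, assuming scikit-learn's splitter; the stratification and random seed are our choices, and the downstream DNN learner itself is omitted.

```python
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def prepare_features(images, labels, size=(28, 28), seed=0):
    """Resize the Gabor feature images, normalize them, and make the
    80-20 train/test split described above."""
    X = np.stack([cv2.resize(img, size) for img in images]).astype(np.float32)
    X /= 255.0  # normalization applied to enable fast computation
    return train_test_split(X, labels, test_size=0.2,
                            stratify=labels, random_state=seed)
```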
3.3 | Experimental results

In Table 2, the column "Dataset" lists the datasets used in the experiment, the column "Proposed" gives the accuracy achieved by the proposed method (i.e., Gabor features with histogram equalization), and the column "Gabor" gives the accuracy achieved by the method using only Gabor features without histogram equalization.

Owing to the nondeterministic mechanism of deep networks and the stochastic nature of the processes, the results of the proposed method were obtained over 30 independent runs. The results presented are those obtained by the best model, with the upper and lower bounds of the 95% confidence interval. Conversely, because the Gabor method is deterministic, only a single result is presented.

TABLE 2 Results of the presented method and other state-of-the-art methods on each dataset.

Dataset    | Proposed    | Gabor | Bilateral | Guided | Wavelength
BioID      | 100.0 ± 0.0 | 90.16 | 93.77     | 97.05  | 96.72
MaskedAT&T | 75.0 ± 3.9  | 59.32 | 31.25     | 13.75  | 41.25
Flickr30   | 93.75 ± 3.3 | 69.23 | 90.0      | 96.67  | 93.33
CK+        | 100.0 ± 0.0 | 90.46 | 100.0     | 100.0  | 100.0

As shown in Table 2, the results obtained by the proposed method on all the datasets used outperformed those obtained by the Gabor method. Moreover, we performed a statistical test (two-sample unpaired t-test) to determine the significance of the differences. We found that the accuracy achieved by the proposed method exceeded that of the Gabor method (p < 0.001) on all four datasets and exceeded that of the bilateral, guided, and wavelength filtering-based methods in most cases by a significant amount.

Initially, an analysis of variance was performed to test the significance of the interaction between the results. We found a significant main effect of the method [F(5, 434) = 81.24, p < 0.001, η² = 0.65] and of the dataset [F(3, 434) = 236.92, p < 0.001, η² = 0.82], indicating that the achieved accuracies vary across methods and datasets. We also found a significant interaction between method and dataset [F(12, 434) = 117.81, p < 0.001, η² = 0.87], which indicates differences in the mean accuracies among the methods and datasets.

A post-hoc, two-sample t-test using Bonferroni correction (corrected α = 0.0125) was performed to test the significance of the differences in accuracy achieved by the proposed method compared with the state-of-the-art deep networks; a sketch of this test is given at the end of this section.

In Table 3, the symbols "↑," "≈," and "↓" are used in conjunction with each compared method to show how the proposed method performs relative to it. The symbol "↓" indicates that the compared method achieved significantly lower accuracy than the proposed method, "≈" indicates either a lack of significant differences or that the compared method achieved a similar result, and "↑" indicates that the compared method achieved significantly higher accuracy than the proposed method.

TABLE 3 Comparison of the proposed method with state-of-the-art deep networks on each tested dataset.

Dataset    | Proposed    | ResNet50      | InceptionV3   | VGG19         | FaceMesh_DNN [32] | MTN [33]
BioID      | 100.0 ± 0.0 | 83.07 ± 1.6 ↓ | 66.54 ± 3.3 ↓ | 81.89 ± 1.1 ↓ | 97.92 ± 0.4 ↓     | 97.74 ± 0.2 ↓
MaskedAT&T | 75.0 ± 3.9  | 28.81 ± 9.0 ↓ | 37.29 ± 3.6 ↓ | 35.59 ± 4.6 ↓ | 71.88 ± 5.6 ≈     | 56.25 ± 1.0 ↓
Flickr30   | 93.75 ± 3.3 | 48.08 ± 3.9 ↓ | 28.85 ± 4.0 ↓ | 50.0 ± 4.9 ↓  | 99.89 ± 0.6 ↑     | 61.62 ± 19.03 ↓
CK+        | 100.0 ± 0.0 | 76.53 ± 1.0 ↓ | 78.59 ± 0.9 ↓ | 76.67 ± 0.8 ↓ | 100.0 ± 0.0 ≈     | 100.0 ± 0.0 ≈

As shown in Table 3, the performance of the proposed method is better than that of the state-of-the-art deep networks. The post-hoc, two-sample t-test showed that the accuracy achieved by the proposed method was significantly higher (all p's < 0.001) than that achieved by the state-of-the-art deep networks on all four datasets.
4 | DISCUSSIONS AND FUTURE STUDY

The proposed eye socket recognition system based on the IHFI (Figure 9) and Gabor features has yielded promising results in the evaluation on several benchmark datasets. Accordingly, it is expected that additional studies will be conducted in this domain in the future. The performance evaluation of any recognition system relies heavily on the datasets used for testing. Although the proposed method was evaluated on four benchmark datasets with varying perspectives and difficulties, further expansion of the datasets could provide a more comprehensive assessment of its effectiveness. The inclusion of datasets from diverse populations, ethnicities, and age groups would help assess the generalizability and robustness of the method across different demographics.

Although the benchmark datasets used in this study covered a range of eye shapes, coverings, and angles, challenging conditions may still exist that have not been fully addressed. For instance, the presence of occlusions, such as eyeglasses or partial obstruction of the eye socket, can affect recognition accuracy. It would be valuable to investigate the performance of the proposed method under such challenging conditions and explore techniques to mitigate their effects.

This study primarily focused on identity recognition based on eye socket shapes. However, eye socket characteristics may change over time owing to factors such as aging, injury, or surgical intervention. Conducting longitudinal analyses to examine the stability and consistency of eye socket features within the same individual over an extended period would provide insights into the viability of eye socket recognition as a long-term biometric modality.

The proposed method focuses solely on the eye socket shape and region without incorporating other biometric modalities. Investigating the potential benefits of fusing eye socket recognition with other biometric traits, such as iris texture or facial features, could enhance the accuracy and reliability of identification systems. Exploring multimodal fusion techniques and assessing their performance in comparison with unimodal approaches are valuable directions for future research.

It is also crucial to evaluate the real-time performance of the proposed method to facilitate its practical application. Considering the potential use of eye socket recognition in security systems, access control, and healthcare applications, future studies should aim to optimize the algorithm for efficient computation and explore hardware acceleration techniques to enable real-time processing, which would make this method suitable for real-time applications.

Apart from technical aspects, the study of the user experience, user acceptance, and human factors associated with eye socket recognition is vital. Conducting user studies, gathering feedback from end-users, and investigating usability aspects can provide valuable insights into system improvement and identify potential challenges in user adoption.

In conclusion, although the proposed eye socket recognition system achieved promising results, further research is required to address these issues. Future studies will contribute to a more comprehensive understanding of the capabilities, limitations, and potential applications of the proposed method, ultimately advancing the field of biometric identification based on eye socket shapes.

5 | CONCLUSION

This study introduced a fast and innovative method for eye socket detection that eliminates the need for iris boundaries. The proposed approach utilizes an IHFI to extract Gabor features from eye socket regions previously detected using AdaBoost Haar-cascade classifiers. These IHFI images capture the most probable and frequent features. The generated Gabor features were then compared with the training set using normalized cross-correlation. The proposed system was evaluated using different datasets, including Flickr30, BioID, Masked AT&T, and CK+. These datasets comprised benchmarking images of various emotions and perspectives. Additionally, some images featured individuals wearing glasses, makeup, or face masks, among other factors. The experimental results and comparative studies demonstrated that the proposed method outperformed existing state-of-the-art approaches. These findings highlight the effectiveness of eye socket recognition using IHFIs within the Gabor transform, as it exhibits high recognition accuracy and robustness to variations in lighting and noise. This system has potential for diverse applications in biometric identification and medical diagnosis.

CONFLICT OF INTEREST STATEMENT
The authors declare that there are no conflicts of interest.

DATA AVAILABILITY STATEMENT
Information on downloading the data used in this study can be found online (GitHub).

CODE AVAILABILITY
The code implemented in this study is available online (GitHub).

ORCID
Harisu Abdullahi Shehu https://orcid.org/0000-0002-9689-3290

REFERENCES
1. N. Min-Allah, F. Jan, and S. Alrashed, Pupil detection schemes in human eye: a review, Multimed. Syst. 27 (2021), no. 4, 753–777.
2. M. Mukhiddinov, O. Djuraev, F. Akhmedov, A. Mukhamadiyev, and J. Cho, Masked face emotion recognition based on facial landmarks and deep learning approaches for visually impaired people, Sens. 23 (2023), no. 3, 1080.
3. F. Izhar, S. Ali, M. Ponum, M. T. Mahmood, H. Ilyas, and A. Iqbal, Detection & recognition of veiled and unveiled human face on the basis of eyes using transfer learning, Multimed. Tools Applicat. 82 (2023), no. 3, 4257–4287.
4. C.-L. Lee, W. Pei, Y.-C. Lin, A. Granmo, and K.-H. Liu, Emotion detection based on pupil variation, Healthcare 11 (2023). https://doi.org/10.3390/healthcare11030322
5. I. Kayadibi, G. E. Güraksın, U. Ergün, and N. Özmen Süzme, An eye state recognition system using transfer learning: AlexNet-based deep convolutional neural network, Int. J. Computat. Intell. Syst. 15 (2022), no. 1. https://doi.org/10.1007/s44196-022-00108-2
6. X. Wang, A. Haji Fathaliyan, and V. J. Santos, Toward shared autonomy control schemes for human-robot systems: action primitive recognition using eye gaze features, Frontiers Neurorobot. 14 (2020). https://doi.org/10.3389/fnbot.2020.567571
7. P. S. Lamba, D. Virmani, and O. Castillo, Multimodal human eye blink recognition method using feature level fusion for exigency detection, Soft Comput. 24 (2020), no. 22, 16829–16845.
8. L. Zhao, Z. Wang, G. Zhang, Y. Qi, and X. Wang, Eye state recognition based on deep integrated neural network and transfer learning, Multimed. Tools Applicat. 77 (2018), 19415–19438.
9. V. K. Hahn and S. Marcel, Biometric template protection for neural-network-based face recognition systems: a survey of methods and evaluation techniques, IEEE Trans. Inform. Forensics Secur. 18 (2022), 639–666.
10. J. Z. Lim, J. Mountstephens, and J. Teo, Emotion recognition using eye-tracking: taxonomy, review and current challenges, Sens. 20 (2020), no. 8, 2384. https://doi.org/10.3390/s20082384
11. B. Mahanama, Y. Jayawardana, S. Rengarajan, G. Jayawardena, L. Chukoskie, J. Snider, and S. Jayarathna, Eye movement and pupil measures: a review, Frontiers Comput. Sci. 3 (2022), 733531. https://doi.org/10.3389/fcomp.2021.733531
12. S. Saurav, P. Gidde, R. Saini, and S. Singh, Real-time eye state recognition using dual convolutional neural network ensemble, J. Real-Time Image Process. 19 (2022), no. 3, 607–622.
13. S. Lee, C. Y. Low, J. Kim, and A. B. J. Teoh, Robust sclera recognition based on a local spherical structure, Expert Syst. Applicat. 189 (2022), 116081. https://doi.org/10.1016/j.eswa.2021.116081
14. C. K. Sharma, [incomplete reference].
15. D. Fitousi, N. Rotschild, C. Pnini, and O. Azizi, Understanding the impact of face masks on the processing of facial identity, emotion, age, and gender, Frontiers Psychol. 12 (2021), 4668. https://doi.org/10.3389/fpsyg.2021.743793
16. D. J. Carragher and P. J. B. Hancock, Surgical face masks impair human face matching performance for familiar and unfamiliar faces, Cognit. Research: Principles Implicat. 5 (2020), no. 1, 1–15.
17. H. A. Shehu, W. N. Browne, and H. Eisenbarth, A comparison of humans and machine learning classifiers categorizing emotion from faces with different coverings, Appl. Soft Comput. 130 (2022), 109701.
18. E. Noyes, J. P. Davis, N. Petrov, K. L. H. Gray, and K. L. Ritchie, The effect of face masks and sunglasses on identity and expression recognition with super-recognizers and typical observers, Royal Soc. Open Sci. 8 (2021), no. 3. https://doi.org/10.1098/rsos.201169
19. H. M. Shahzad, S. M. Bhatti, A. Jaffar, and M. Rashid, A multimodal deep learning approach for emotion recognition, Intell. Autom. & Soft Comput. 36 (2023), no. 2, 1561–1570.
20. E. Freud, A. Stajduhar, R. S. Rosenbaum, G. Avidan, and T. Ganel, The COVID-19 pandemic masks the way people perceive faces, Scientific Reports 10 (2020), no. 1. https://doi.org/10.1038/s41598-020-78986-9
21. R. A. S. Naseri, A. Kurnaz, and H. M. Farhan, Optimized face detector-based intelligent face mask detection model in IoT using deep learning approach, Appl. Soft Comput. 134 (2023), 109933. https://doi.org/10.1016/j.asoc.2022.109933
22. R. K. Bania, Ensemble of deep transfer learning models for real-time automatic detection of face mask, Multimed. Tools Applicat. 82 (2023), 1–23.
23. V. K. Kaliappan, R. Thangaraj, P. Pandiyan, K. Mohanasundaram, S. Anandamurugan, and D. Min, Real-time face mask position recognition system using YOLO models for preventing COVID-19 disease spread in public places, Int. J. Ad Hoc Ubiquitous Comput. 42 (2023), no. 2, 73–82.
24. F. Bulut, H. A. Shehu, and I. F. Ince, Performance boosting of image matching-based iris recognition systems using deformable circular hollow kernels and uniform histogram fusion images, J. Electron. Imag. 31 (2022), no. 5, 053036.
25. BioID face database. https://www.bioid.com/About/BioID-Face-Database
26. F. S. Samaria and A. C. Harter, Parameterisation of a stochastic model for human face identification, (Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, Sarasota, FL, USA), 1994, pp. 138–142.
27. Flickr images dataset. www.flickr.com/photos/thefacewemake/albums
28. P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression, (IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, San Francisco, CA, USA), 2010, pp. 94–101.
29. K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, (IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA), 2016, pp. 770–778.
30. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, Rethinking the inception architecture for computer vision, (IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA), 2016, pp. 2818–2826.
31. K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint, 2014. https://doi.org/10.48550/arXiv.1409.1556
32. S. Hangaragi, T. Singh, and N. Neelima, Face detection and recognition using face mesh and deep neural network, Procedia Comput. Sci. 218 (2023), 741–749.
33. P. Foggia, A. Greco, A. Roberto, A. Saggese, and M. Vento, Identity, gender, age, and emotion recognition from speaker voice with multi-task deep networks for cognitive robotics, Cogn. Comput. 16 (2024), 2713–2723.
34. Pictures of before and after makeup. https://104fashion.com/pictures-of-before-and-after-makeup/
35. The face we make. https://pictures.dxtr.com/The-Face-We-Make/n-D8Ps72/TFWM-Cropped/
36. P. Viola and M. J. Jones, Robust real-time face detection, Int. J. Comput. Vis. 57 (2004), 137–154.
37. R. Lienhart and J. Maydt, An extended set of Haar-like features for rapid object detection, (Proceedings of the International Conference on Image Processing, Rochester, NY, USA), 2002. https://doi.org/10.1109/ICIP.2002.1038171
AUTHOR BIOGRAPHIES

Harisu Abdullahi Shehu received his B.Sc. degree in Computer Engineering from Gediz University, Turkey, where he was the top-ranked student in his department. He received his M.Sc. degree in Computer Engineering from Pamukkale University, Turkey, and his Ph.D. degree in the same field from Victoria University of Wellington, New Zealand. He serves as a reviewer for international journals, such as IEEE Transactions on Affective Computing and IEEE Transactions on Cybernetics, and for conferences, such as the IEEE Congress on Evolutionary Computation and the Asian Joint Conference on Artificial Intelligence. He is currently a Researcher with Victoria University of Wellington, New Zealand. His research interests include computer vision, machine learning, and emotion detection from facial movement patterns and physiological changes.

Ibrahim Furkan Ince received his Ph.D. in Information Technology Convergence Design from the Graduate School of Digital Design, Kyungsung University, Busan, Republic of Korea, in 2010. He conducted postdoctoral research at the University of Tokyo, Japan, between 2010 and 2012. He worked as a Chief Research Engineer at Hanwul Multimedia Communication Co., Ltd., Busan, Republic of Korea, from May 2012 to May 2014. Additionally, he worked as an Assistant Professor of Computer Engineering at Gediz University, Izmir, Turkey, from November 2014 to July 2016, and as an Assistant Professor in the Department of Computer Engineering at Nisantasi University, Istanbul, Turkey, between October 2019 and June 2024. He is currently an Assistant Professor in the Department of Software Engineering, Istinye University, Istanbul, Turkey. His research interests include image processing, computer vision, pattern recognition, and human-computer interaction.

Faruk Bulut is an Associate Professor in the Computer Engineering Department at Istanbul Esenyurt University. His research interests include machine learning, image processing, optimization methods, decision-support systems, and graph theory. Throughout his career, Bulut has assumed roles as an educator, instructor, engineer, project manager, and consultant and has contributed to numerous projects in both the private and government sectors.

How to cite this article: H. A. Shehu, I. F. Ince, and F. Bulut, Enhancement of eye socket recognition performance using inverse histogram fusion images and the Gabor transform, ETRI Journal 47 (2025), 123–133. DOI 10.4218/etrij.2023-0395
