Automatic 3D Face Recognition Combining Global Geometric Features With Local Shape Variation Information



Chenghua Xu¹, Yunhong Wang¹, Tieniu Tan¹, Long Quan²

¹ Center for Biometric Authentication and Testing, National Laboratory of Pattern Recognition,
Institute of Automation, Chinese Academy of Sciences, Beijing, P. R. China, 100080
E-mails: {chxu, wangyh, tnt}@nlpr.ia.ac.cn

² Department of Computer Science, Hong Kong University of Science and Technology,
Clear Water Bay, Kowloon, Hong Kong
E-mail: quan@cs.ust.hk

Abstract

Face recognition has been a central topic in pattern recognition over the past decades. In this paper, we propose a new scheme for face recognition using 3D information. In this scheme, the scattered 3D point cloud is first represented with a regular mesh using hierarchical mesh fitting. Local shape variation information is then extracted to characterize the individual together with global geometric features. Experimental results on 3D_RMA, likely the largest 3D face database currently available, demonstrate that the local shape variation information is very important for improving recognition accuracy and that the proposed algorithm achieves promising performance at a low computational cost.

Keywords: 3D face recognition, shape variation, mesh model, Gaussian-Hermite moments

1. Introduction

Nowadays biometric identification has attracted much attention due to the urgent need for more reliable personal identification. Of all the biometric features, the face is among the most common and most accessible, so face recognition remains one of the most active research issues in pattern recognition. In the past decades, most work has focused on 2D intensity or color images. Since the accuracy of 2D face recognition is affected by variations in pose, expression, illumination and accessories, it is still difficult to develop a robust automatic 2D face recognition system.

3D facial data provide a promising way to understand the human face in 3D space and have the potential to improve system performance. Using 3D information offers some distinct advantages: sufficient geometric information, invariance of the measured features under transformation, and immunity of laser-scanner capture to illumination variation.

With the development of 3D acquisition systems, 3D capture is becoming faster and cheaper, and face recognition based on 3D information is attracting more and more attention. Some early work on curvature analysis [1,2,3] addressed face recognition based on high-quality range data acquired from 3D laser scanners. More recently, Blanz et al. [4,5] constructed a 3D morphable model as a linear combination of the shape and texture of multiple exemplars. That model could be fitted to a single image to obtain individual parameters, which were used to characterize personal features. Their results seemed very promising, except that the modeling process incurred a high computational cost. Chua et al. [6] treated face recognition as a 3D non-rigid surface matching problem and divided the human face into rigid and non-rigid regions; the rigid parts are represented by point signatures to identify the individual. Bronstein et al. [7] represented the facial surface with geometric invariants to isometric deformations and realized multi-modal recognition by integrating flattened textures and canonical images; their algorithm was robust to some expression variations. Beumier et al. [8,9] developed a 3D acquisition prototype based on structured light and built a 3D face database. They also proposed two matching methods, surface matching and central/lateral profiles, to compare two instances; both construct central and lateral profiles to represent the individual and obtain the matching value by minimizing the distance between profiles. It should be noted that 3D face recognition faces two main difficulties: high computational and spatial cost, and inconvenient 3D capture. Existing methods usually have a high computational cost [4,5,8,9] or are tested on small databases [1,2,3,6].

In our previous work [10], we used global geometric features to realize face recognition. We further observed that the shape variation of local areas (e.g. mouth, nose, etc.) is also crucial for characterizing the individual. In this paper, we develop an automatic face recognition method combining global geometric features with local shape variation information.
The main contributions of this paper are as follows: 1) A robust method is developed to build a regular mesh model from the scattered point cloud. 2) Local shape variation information is extracted to represent the face together with the global geometric features. Here we first define a metric to quantify the local shape, and then Gaussian-Hermite moments [11,12] are applied to describe the shape variation.

The remainder of this paper is organized as follows. In Section 2, we introduce how to obtain the regular mesh model from the 3D point cloud. The process of feature extraction is described in Section 3. Section 4 describes the classifiers used for face recognition. Section 5 reports the experimental results and gives some comparisons with existing methods. Finally, Section 6 summarizes the paper and outlines future work.

2. Face modeling

In this work, we use the face database 3D_RMA [8], in which each sample is represented by one scattered 3D point cloud. We intend to build a regular mesh with a fixed number of nodes and facets to represent the shape of a human face. Moreover, the different meshes have corresponding nodes and the same pose as the average model.

Our modeling process includes three steps: pre-modeling, pose acquisition and re-modeling, as outlined in Fig. 1. This process is described in [10]; here we only describe it concisely.

Figure 1. Modeling process: 3D point cloud → pre-modeling → pose acquisition → re-modeling.

2.1. Pre-modeling

Beginning with a simple basic mesh (see Fig. 2a), a regular and dense mesh model is generated to fit the scattered 3D point cloud. We developed a universal fitting algorithm that regulates the hierarchical mesh to conform to the 3D points. This process includes two steps: initialization of the basic mesh and fitting of the hierarchical meshes.

Due to the limited data quality, the nose seems to be the only facial feature providing robust geometric cues for the initial step. We localize the prominent nose in the point cloud and use it to initialize the basic mesh. After initialization, the basic mesh is aligned with the point cloud. Nevertheless, the basic mesh is so coarse that it cannot even describe the basic contour of the human face. A non-linear subdivision scheme [13] is therefore used to refine the basic mesh, and at the same time the refined mesh is regulated according to the data at each level. As refinement and regulation proceed, the mesh comes to represent the individual well, level by level.

Fig. 2 shows the mesh after regulation at different refinement levels. The coarse mesh does not describe the human face well even though it attempts to approximate the point cloud, whereas the level-four mesh is dense enough to represent the face surface. Of course, the denser the mesh, the better the face is represented; obviously, a denser mesh also costs more time and space. In this paper, we use the mesh refined four times.

2.2. Pose acquisition by regulating mesh models

Different point clouds have different positions and rotations relative to the 3D equipment. This difference is usually called pose variation. The mesh model obtained in the previous step has the same pose as its corresponding point cloud. Thus we can obtain the pose parameters from the mesh models rather than from the point clouds directly, which saves much time.

First, an average mesh model is obtained by averaging the mesh models from the pre-modeling process. This average model is taken as the ground model, and all the models are rotated and translated to align with it. Finally, we obtain the models that best overlap the average model, together with their pose parameters.

2.3. Re-modeling

We transform the original point clouds using the pose values obtained in the previous step so that they have the same pose as the average mesh model. Each transformed point cloud is then modeled again in the same way as in the pre-modeling stage.

Figure 2. The regulated mesh models at different levels. (a) Basic mesh. (b) Level one. (c) Level two. (d) Level three. (e) Level four. Each mesh is shown in front and profile views.
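Since all mesh models share corresponding nodes, the averaging step of Section 2.2 reduces to a per-node mean. The following is a minimal illustrative Python sketch, not the authors' code: it aligns a model to the average by translation only, whereas the paper's method also estimates rotation.

```python
def average_mesh(meshes):
    """Per-node mean of several mesh models with corresponding nodes.

    Each mesh is a list of (x, y, z) tuples; node i of every mesh
    corresponds to node i of every other mesh, so averaging is node-wise.
    """
    n = len(meshes[0])
    return [tuple(sum(m[i][d] for m in meshes) / len(meshes) for d in range(3))
            for i in range(n)]

def centroid(mesh):
    # Mean position of all vertices of one mesh model.
    n = len(mesh)
    return tuple(sum(v[d] for v in mesh) / n for d in range(3))

def align_to(mesh, ground):
    """Translate `mesh` so its centroid matches the ground model's centroid.

    The paper also solves for the rotation between the model and the
    ground model; that part is omitted to keep this sketch short.
    """
    c, g = centroid(mesh), centroid(ground)
    shift = tuple(g[d] - c[d] for d in range(3))
    return [tuple(v[d] + shift[d] for d in range(3)) for v in mesh]
```

With these pieces, the pose-acquisition loop is: build the average mesh once from the pre-modeled meshes, then align every model to it and record the recovered offsets as (partial) pose parameters.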
After this stage, a regular mesh model is built for each point cloud. Moreover, all these mesh models have the same pose and represent the facial geometric shape realistically. Next we use this kind of model to extract the individual features.

3. Feature extraction

Our feature vector characterizing the individual includes two parts: global geometric features and local shape variation information. We discuss them in turn below.

3.1. Global geometric features

From the modeling process above, each point cloud is described by a regular mesh. All these mesh models have the same pose and corresponding nodes, which have the same positions in the X-Y plane and different values along the Z-axis. Thus we can build a feature vector describing the global geometric features as follows:

V_geometric = {Z(v_1), Z(v_2), …, Z(v_n)}    (1)

where Z(v_i) is the Z-coordinate of node v_i of the mesh model. We used this normalized vector to characterize the individual previously [10], but it has limited ability to improve recognition accuracy.

3.2. Local shape variation information

From our observations, we find that shape variation, especially near areas such as the mouth, nose and eyes, is important information for characterizing the individual. In signal processing, Gaussian-Hermite (G-H) moments provide an effective way to quantify signal variation and have wide applications in signal and image processing [11,12]. Here we first define a metric that describes the shape of the principal areas with a 1-D vector, and then use G-H moments to analyze the shape variation.

To reduce redundancy, we only consider the areas with larger shape variation. We estimate the positions of four areas (mouth, nose, left eye, right eye) in the average mesh model and mark the same areas in each individual mesh model at the same positions, as shown in the left image of Fig. 3. Although the marked areas only approximately label the corresponding areas in the individual model, this is sufficient for the following process.

To transfer the 3D shape into a 1-D vector, we first define a metric describing the shape around one vertex. For each vertex p_e in a marked area, its neighboring vertices {p_e1, p_e2, …, p_en} are obtained easily, as shown in the right image of Fig. 3. In our regular mesh model, the number of neighboring vertices of a common (non-edge) vertex is always six. The shape metric of such a vertex can be described by a vector whose components are the distances from p_e to its neighboring vertices, taken counterclockwise from the top-left vertex, i.e.

s_e = {d_e1, d_e2, …, d_e6}    (2)

where d_ei is the distance from p_e to p_ei. This vector describes the shape near the vertex. According to this metric, we can describe the shapes of the four marked areas with the following vectors, respectively:

S_mouth = {s_m1, s_m2, …, s_mn}
S_nose = {s_n1, s_n2, …, s_nn}
S_leye = {s_le1, s_le2, …, s_len}
S_reye = {s_re1, s_re2, …, s_ren}    (3)

where s_i is the shape vector of one vertex in the corresponding marked area.

The nth-order 1-D G-H moment M_n(x, S(x)) of a signal S(x) is defined as [12]:

M_n(x) = ∫_{−∞}^{+∞} B_n(t) S(x + t) dt,  n = 0, 1, 2, …    (4)

with

B_n(t) = g(t, σ) H_n(t/σ)
H_n(t) = (−1)^n exp(t²) dⁿ exp(−t²)/dtⁿ    (5)
g(t, σ) = (2πσ²)^(−1/2) exp(−t²/(2σ²))

where g(t, σ) is a Gaussian function and H_n(t) is a scaled Hermite polynomial of order n. G-H moments have several desirable properties; in particular, they are insensitive to the noise generated by differential operations. The parameter σ and the order of the G-H moments need to be determined by experiment. Here we use the 1st- and 2nd-order G-H moments with σ = 2.0 to analyze the shape variation.

For each shape vector in Eq. (3), we calculate its 1st- and 2nd-order G-H moments, obtaining eight 1-D vectors, i.e. M_m,1, M_m,2, M_n,1, M_n,2, M_le,1, M_le,2, M_re,1, M_re,2. Each vector describes the shape variation of one marked area. After normalization, these vectors are concatenated into one 1-D feature vector:

M = {M_m,1, M_m,2, …, M_re,1, M_re,2}    (6)

We use this feature vector to describe the shape variation information of the marked areas.

Figure 3. The marked areas and the shape representation of one vertex.
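To make the feature computation concrete, here is a minimal Python sketch (illustrative, not the authors' implementation) of the per-vertex shape vector of Eq. (2) and a discrete approximation of the G-H moments of Eqs. (4)-(5). The integration window half-width and sampling step are assumptions of this sketch.

```python
import math

def shape_vector(p, neighbors):
    # Eq. (2): distances from vertex p to its six neighbouring vertices,
    # taken in a fixed (counterclockwise) order.
    return [math.dist(p, q) for q in neighbors]

def hermite(n, t):
    # Physicists' Hermite polynomial H_n(t) via the standard recurrence
    # H_{k+1}(t) = 2t H_k(t) - 2k H_{k-1}(t).
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * t
    for k in range(1, n):
        h_prev, h = h, 2.0 * t * h - 2.0 * k * h_prev
    return h

def gh_basis(n, t, sigma):
    # B_n(t) = g(t, sigma) * H_n(t / sigma), with g a Gaussian (Eq. 5).
    g = math.exp(-t * t / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)
    return g * hermite(n, t / sigma)

def gh_moment(signal, x, n, sigma=2.0, half_width=8.0, step=0.1):
    # Discrete approximation of Eq. (4): M_n(x) = integral of B_n(t) S(x+t) dt.
    # The signal is sampled at integer positions; values outside the signal
    # are treated as zero.  half_width = 8 covers +/- 4 sigma for sigma = 2.
    total = 0.0
    steps = int(half_width / step)
    for k in range(-steps, steps + 1):
        t = k * step
        idx = round(x + t)
        if 0 <= idx < len(signal):
            total += gh_basis(n, t, sigma) * signal[idx]
    return total * step
```

Applying gh_moment with n = 1 and n = 2 along each concatenated shape vector of Eq. (3) yields the eight moment vectors that are combined in Eq. (6).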
3.3. Feature vector

We concatenate the geometric vector V_geometric and the shape variation vector M to form the feature vector

F = {V_geometric, M}    (7)

It not only describes the global geometric features but also contains the local shape variation information. In our case the mesh model has 545 nodes, so the geometric vector V_geometric contains 545 components. The shape variation vector M contains 156 × 6 × 2 components, since the four marked areas together have 156 vertices. Thus the total number of components in the feature vector F is 2417.

4. Matching

The point cloud is represented by a regular mesh model in Section 2, and the mesh model is then characterized by a 1-D feature vector F as described in Section 3. To reduce the computational cost and improve recognition performance, we use principal component analysis (PCA) to obtain a lower-dimensional vector, and a nearest neighbor (NN) classifier is then used for classification.

There are two popular methods for linear dimensionality reduction, PCA [14] and Fisher linear discriminant (FLD) [15]. As discussed in [15], the FLD method usually needs more training samples to obtain good results. In our case the samples are limited, so we adopt PCA to transform the higher-dimensional vector F into a lower-dimensional vector G.

We use the nearest neighbor classifier to solve the classification problem in the lower-dimensional space. The similarity between two feature vectors is measured by the Euclidean distance d_i = [(G − G_i)(G − G_i)^T]^(1/2). Here our focus is to validate the separability of the proposed features, so we use only simple classifiers; more sophisticated classifiers could be used to improve recognition accuracy.

5. Experiments

To demonstrate the performance of the proposed method, we implemented it on the 3D database 3D_RMA [8,9]. All tests were run on a PIV 1.3 GHz CPU with 128 MB DRAM. The modeling experiments were done with C++ and OpenGL, and the rest with Matlab (version 6.1).

5.1. 3D face databases

Our proposed method is tested on the 3D face database 3D_RMA [8,9], where each face is described by a scattered 3D point cloud obtained by structured light. Compared with data obtained from a laser scanner (one example is shown in Fig. 4b), these point clouds are of limited quality (Fig. 4a).

The database includes 120 persons and two sessions: Nov. 97 (session1) and Jan. 98 (session2). In each session, three shots were taken of each person, corresponding to central, limited left/right and up/down poses. People sometimes wear spectacles, and beards and moustaches are also represented; some people smile in some shots. From these sessions, two databases are built: the Automatic Database (ADB, 120 persons) and the Manual Database (MDB, 30 persons). The data in MDB have better quality than those in ADB. In this paper, we test the proposed method on the session1, session2 and session1-2 (both sessions blended) sets of ADB and MDB.

Figure 4. 3D data. (a) From 3D_RMA; (b) from a laser scanner.

5.2. Experimental results

During processing, the dimensionality of the original feature vector F is reduced using PCA. The dimensionality of the reduced feature vector strongly affects the recognition rate. Fig. 5 shows how the recognition rate varies with the dimensionality of the reduced feature vector on the session2 set of MDB. As the dimensionality increases, the recognition rate rises rapidly; but once the dimensionality reaches about 50 or higher, the recognition rate nearly stabilizes (at about 94.4%). Thus we use only 50 features in the following experiments.

Identification accuracy is evaluated on the different sets of 3D_RMA. Considering the limited number of samples, we use leave-one-out cross validation: each time we leave one sample out as the test sample and train on the remainder. After computing the similarities between the test sample and the training data using Euclidean distance, the nearest neighbor rule is applied for classification. To validate the effectiveness of the shape variation information, we estimate the recognition rate using both the feature vector V_geometric, which includes only the global geometric features (GGF), and the feature vector F, which contains GGF plus the shape variation information (GGF+SVI).
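The leave-one-out protocol just described can be sketched as follows. This is a hypothetical minimal version operating on already PCA-reduced feature vectors; the PCA step itself is omitted.

```python
import math

def loo_nn_ccr(features, labels):
    """Leave-one-out correct classification rate with a 1-NN classifier.

    `features` are the reduced feature vectors (one per sample) and
    `labels` the subject identities.  Each sample in turn is the probe;
    the remaining samples form the gallery, and the probe is assigned
    the label of its Euclidean nearest neighbour.
    """
    n = len(features)
    correct = 0
    for i in range(n):
        nearest = min((j for j in range(n) if j != i),
                      key=lambda j: math.dist(features[i], features[j]))
        correct += labels[nearest] == labels[i]
    return correct / n
```

For example, two well-separated subjects with two samples each yield a rate of 1.0, while a probe whose only same-identity sample lies far away is misclassified.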
Figure 5. Recognition performance under different dimensionalities of features.

Figure 6. CMS curves for identification performance.

Table 1. CCR in 3D_RMA (%)

Database                                              GGF    GGF+SVI
Manual DB, session1 (30 persons, 3 instances each)    92.2   95.6
Manual DB, session2 (30 persons, 3 instances each)    84.4   94.4
Manual DB, session1-2 (30 persons, 6 instances each)  93.9   96.1
Automatic DB, session1 (120 persons, 3 instances each)  59.2   66.9
Automatic DB, session2 (120 persons, 3 instances each)  59.2   66.7
Automatic DB, session1-2 (120 persons, 6 instances each)  69.4   72.4

Figure 7. ROC curves for verification performance.
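Cumulative Match Scores of the kind plotted in Fig. 6 can be computed under the same leave-one-out protocol. The sketch below is illustrative (not the authors' code) and assumes every subject has at least one other sample in the gallery.

```python
import math

def cumulative_match_scores(features, labels, max_rank):
    # Rank-k CMS: the fraction of probes whose true identity occurs among
    # their k nearest gallery samples (leave-one-out, Euclidean distance).
    # The Rank-1 CMS equals the correct classification rate (CCR).
    n = len(features)
    hits = [0] * max_rank
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(features[i], features[j]))
        # 1-based rank of the first gallery sample sharing the probe's label
        rank = next(r + 1 for r, j in enumerate(order) if labels[j] == labels[i])
        for k in range(rank - 1, max_rank):
            hits[k] += 1
    return [h / n for h in hits]
```

Plotting the returned list against rank reproduces a CMS curve; its first entry is the CCR reported in Table 1.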
Table 1 summarizes the Correct Classification Rate (CCR) for these two feature sets.

In addition, we use another widely adopted measure, the Cumulative Match Score (CMS) [16], to evaluate the identification performance of GGF+SVI. Fig. 6 shows the CMS curves obtained with the NN classifier on the three data sets of MDB. Note that the CCR equals the CMS at Rank = 1.

We also test verification performance using the leave-one-out scheme. In each test, one sample is the probe and the remaining samples form the training set; the probe is classified against the training set. In each iteration there is only one true test, since we know the class of the probe sample. Fig. 7 shows the ROC curves for the different data sets of MDB.

From an overall view of Table 1 and Figs. 6-7, we can draw the following conclusions:
a) The highest recognition rate reaches 96.1% (30 persons) and 72.4% (120 persons). Although the test database is not very large, this result is obtained in a fully automatic way, which is fairly encouraging.
b) Shape variation is important information for characterizing the individual. The feature vector containing the shape variation information improves the CCR distinctly (see Table 1).
c) Increasing the number of training samples improves both verification and identification performance (see Table 1 and Figs. 6-7). The sets blending two sessions always perform better.
d) Noise and the size of the test database strongly affect the CCR. In Table 1, the CCR on ADB is lower than that on MDB.

5.3. Comparisons and discussions

We make detailed comparisons with some existing methods for 3D face recognition to show the performance of the proposed algorithm.

(1) Our method has a lower computational cost. Our modeling process takes the most time (about 2 s per sample) and feature extraction takes about 0.2 s. The matching process, however, takes little time, since it only computes the Euclidean distance between two points in a lower-dimensional space. Beumier et al. [8] built the 3D_RMA database and developed surface matching (SURF) and central/lateral profile (CLP) methods for face authentication. The reported verification performance of their automatic matching algorithms is close to ours, as shown in Table 2. However, their matching is an optimization process, which incurs a high computational cost (at least 0.5 s per match). A lower matching cost means less time to search for the corresponding subject in a large database.

Blanz et al. [4,5] developed an excellent method that fits a 3D deformable model to an image and uses the obtained shape and texture coefficients for face recognition. Because the settings differ so greatly, it makes little sense to compare recognition performance directly; but it should be noted that their fitting procedure is slow and requires some manual interaction.
(2) Our algorithm is tested on a bigger and more complex database. Gordon [2] obtained a higher recognition rate (100%) using depth and curvature features, but on a small database (only 8 persons) of high-quality range data (similar to Fig. 4b) without eyeglasses, beards or pose variations.

Table 2. EER of our algorithm and the automatic matching algorithms of Beumier et al. [8] on the manual DB

Algorithm           Session1  Session2  Session1-2
Beumier [8], SURF   8.0%      7.0%      13.0%
Beumier [8], CLP    4.75%     6.75%     7.25%
Ours                5.5%      4.5%      4.0%

Chua et al. [6] used the rigid regions to characterize the individual in order to overcome the influence of expressions. They tested their algorithm on only six subjects (four expressions each, without pose variations) and obtained promising results. Our algorithm is evaluated on 3D_RMA, which contains 120 persons with varying data quality and limited pose and expression variations.

(3) From the analysis in Section 2, our modeling scheme can overcome pose variation effectively. However, the feature extraction is strongly influenced by expressions. Chua et al. [6] did some work on handling expression variation using a small database; this is also a focus of our future work to further improve recognition accuracy. In addition, some mesh models cannot describe the true shape of the real person because of noise, resulting in false feature vectors, so increasing the precision of the mesh model is another avenue for improving recognition performance.

In Eq. (7), we concatenate the geometric vector V_geometric and the shape variation vector M directly. Intuitively, this is not the best way to fuse them, since they have different properties. In the future, we will study this topic more deeply to obtain a better fusion algorithm.

6. Conclusions

In this paper, we have proposed a new scheme for 3D face recognition. In this scheme, the 3D point cloud is first represented by a regular mesh using hierarchical mesh fitting. Based on the observation that local shape variation is important information for characterizing the individual, our feature vector is constructed by combining the global geometric features and the local shape variation information. We test the proposed algorithm on 3D_RMA, and the encouraging results show the importance of shape variation for characterizing the individual. Compared with previous work, our algorithm demonstrates outstanding performance. In the future, we will focus on searching for features invariant to expressions and on building a larger 3D database for evaluating 3D recognition algorithms.

Acknowledgements

This work is supported by research funds from the Natural Science Foundation of China (Grant Nos. 60121302 and 60332010) and the Outstanding Overseas Chinese Scholars Fund of CAS (No. 2001-2-8). We would like to thank Dr. C. Beumier for the 3D_RMA database. Thanks also to our colleagues for their constructive suggestions.

References

[1] J.C. Lee and E. Milios, "Matching Range Images of Human Faces", Proc. ICCV'90, pp. 722-726, 1990.
[2] G.G. Gordon, "Face Recognition Based on Depth and Curvature Features", Proc. CVPR'92, pp. 108-110, 1992.
[3] Y. Yacoob and L.S. Davis, "Labeling of Human Face Components from Range Data", CVGIP: Image Understanding, 60(2):168-178, 1994.
[4] V. Blanz and T. Vetter, "Face Identification Based on Fitting a 3D Morphable Model", IEEE Trans. on PAMI, Vol. 25, No. 9, pp. 1063-1074, 2003.
[5] V. Blanz, S. Romdhani and T. Vetter, "Face Identification across Different Poses and Illumination with a 3D Morphable Model", Proc. FG'02, pp. 202-207, 2002.
[6] C.S. Chua, F. Han and Y.K. Ho, "3D Human Face Recognition Using Point Signature", Proc. FG'00, pp. 233-239, 2000.
[7] A.M. Bronstein, M.M. Bronstein and R. Kimmel, "Expression-Invariant 3D Face Recognition", Audio- and Video-Based Person Authentication (AVBPA 2003), LNCS 2688, pp. 62-70, 2003.
[8] C. Beumier and M. Acheroy, "Automatic 3D Face Authentication", Image and Vision Computing, 18(4):315-321, 2000.
[9] C. Beumier and M. Acheroy, "Automatic Face Authentication from 3D Surface", Proc. BMVC, pp. 449-458, 1998.
[10] C. Xu, Y. Wang, T. Tan and L. Quan, "A New Attempt to Face Recognition Using 3D Eigenfaces", ACCV'04, pp. 884-889, 2004.
[11] S. Liao and M. Pawlak, "On Image Analysis by Moments", IEEE Trans. on PAMI, Vol. 18, No. 3, pp. 254-266, 1996.
[12] J. Shen, W. Shen and D. Shen, "On Geometric and Orthogonal Moments", Inter. Journal of Pattern Recognition and Artificial Intelligence, Vol. 14, No. 7, pp. 875-894, 2000.
[13] C. Xu, L. Quan, Y. Wang, T. Tan and M. Lhuillier, "Adaptive Multi-resolution Fitting and its Application to Realistic Head Modeling", Geometric Modeling and Processing 2004, to appear.
[14] M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, 3(1):71-86, 1991.
[15] P.N. Belhumeur, J.P. Hespanha and D.J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Trans. on PAMI, Vol. 19, No. 7, pp. 711-720, 1997.
[16] P.J. Phillips, H. Moon, S.A. Rizvi and P.J. Rauss, "The FERET Evaluation Methodology for Face-Recognition Algorithms", IEEE Trans. on PAMI, Vol. 22, No. 10, pp. 1090-1104, 2000.
