Speaker Recognition System: by Divya V
INTRODUCTION
The goal is to verify a speaker's claimed identity, or to identify the
speaker from a known ensemble of speakers.
APPLICATIONS:
banking by telephone
telephone shopping
database access services
information services
voice mail
security control for confidential information areas
remote access to computers.
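GENERAL REPRESENTATION
[Block diagram: the speech signal S(n) enters a signal processor, which produces a pattern vector X. X is compared with the stored reference patterns to give a distance D, and decision logic uses D to make the identification.]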
SUBAREAS OF SPEAKER RECOGNITION
SPEAKER VERIFICATION
An identity is claimed by the user
A single comparison is made between a set of measurements and the reference for the claimed identity
SPEAKER IDENTIFICATION
The speaker is identified among the N speakers in the user population
N comparisons are made between a set of measurements and the references
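SPEAKER VERIFICATION
[Block diagram: the customer to be verified supplies a claimed identity and spoken phrases, and requests a transaction. The spoken phrases are compared with the stored voice patterns, the system decides to accept or reject, and it replies through computer speech answer-back.]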
SIGNAL PROCESSING ASPECT OF SPEAKER VERIFICATION
[Block diagram. Blocks: input S(n); endpoint detector; pitch detector; energy measurement; LPC analysis; formant analysis; two low-pass filters (LPF); dynamic time warping to register the patterns against the stored reference pattern; compute distance; accept or reject.]
TIME WARPING
The pitch, intensity and formant variations of a speaker are not the same
every time a phrase is spoken
Hence the time scale t of the reference utterance is warped:
$\tau = t + q(t)$
where $\tau$ is the warped time
WARPING FUNCTION
For linear warping:
Let the points in
the measured contour be $n = 1, 2, \ldots, N$
the reference contour be $m = 1, 2, \ldots, M$
We wish to choose a time warping function
$m = w(n)$
The boundary conditions on $w(n)$ are:
$w(1) = 1$ (beginning point)
$w(N) = M$ (ending point)
Then the linear time warping function is
$w(n) = \left[ \frac{M-1}{N-1}(n-1) + 1 \right]$
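The linear warping function can be sketched in a few lines of Python. This is only an illustration: the function name `linear_warp` is hypothetical, and the square brackets in the formula are assumed to mean rounding to the nearest integer, which the slide does not state.

```python
import math


def linear_warp(n, N, M):
    """Map measured-contour index n (1..N) to a reference index m (1..M).

    Assumption: the square brackets in w(n) denote rounding to the
    nearest integer (halves rounded up).
    """
    if N < 2 or M < 1:
        raise ValueError("need N >= 2 and M >= 1")
    if not 1 <= n <= N:
        raise ValueError("n must lie in 1..N")
    value = (M - 1) / (N - 1) * (n - 1) + 1
    return math.floor(value + 0.5)


# Example: a 4-point measured contour mapped onto a 3-point reference.
warp = [linear_warp(n, 4, 3) for n in range(1, 5)]
```

The boundary conditions hold by construction: n = 1 gives 1 and n = N gives M.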
WARPING FUNCTION
For nonlinear warping:
To limit the degree of nonlinearity, the warping function cannot change by more
than 2 grid points at any index n
Thus
$w(n+1) - w(n) = 0, 1, 2$ if $w(n) \ne w(n-1)$ (the warped index changed at the previous grid index)
$w(n+1) - w(n) = 1, 2$ if $w(n) = w(n-1)$ (the warped index stayed constant at the previous grid index)
A similarity measure is computed between the test contour at grid index n and
the reference at grid index m
This similarity measure is used to determine the path of the warping function, which locally
minimizes the maximum total distance
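A minimal sketch of a warping path that obeys these slope constraints is given below. Several things here are assumptions rather than part of the slides: the contours are plain lists of numbers, the local distance is the absolute difference, a flat step is allowed at the very first transition, and the path is found by a global dynamic program that minimizes the summed distance instead of the local rule described above. The name `constrained_warp` is hypothetical.

```python
import math


def constrained_warp(test, ref):
    """Find w(1..N) with w(1)=1, w(N)=M and steps of 0, 1 or 2,
    where a step of 0 may not follow another step of 0.

    Returns (total_distance, path) with 1-based reference indices.
    """
    N, M = len(test), len(ref)
    INF = math.inf
    # cost[n][m][s]: best summed distance with w(n+1) = m+1 (0-based storage);
    # s = 1 if the last step was flat, s = 0 otherwise.
    cost = [[[INF, INF] for _ in range(M)] for _ in range(N)]
    back = [[[None, None] for _ in range(M)] for _ in range(N)]
    cost[0][0][0] = abs(test[0] - ref[0])  # assumed local distance

    for n in range(1, N):
        for m in range(M):
            d = abs(test[n] - ref[m])
            # Flat step: only allowed if the previous step was not flat.
            if cost[n - 1][m][0] < INF:
                cost[n][m][1] = cost[n - 1][m][0] + d
                back[n][m][1] = (m, 0)
            # Steps of 1 or 2 are allowed from either state.
            for step in (1, 2):
                pm = m - step
                if pm < 0:
                    continue
                for s in (0, 1):
                    c = cost[n - 1][pm][s] + d
                    if c < cost[n][m][0]:
                        cost[n][m][0] = c
                        back[n][m][0] = (pm, s)

    s = 0 if cost[N - 1][M - 1][0] <= cost[N - 1][M - 1][1] else 1
    total = cost[N - 1][M - 1][s]
    if total == INF:
        raise ValueError("no warping path satisfies the constraints")

    # Trace the path back from the ending point w(N) = M.
    path, m = [], M - 1
    for n in range(N - 1, -1, -1):
        path.append(m + 1)
        if n > 0:
            m, s = back[n][m][s]
    path.reverse()
    return total, path


total, path = constrained_warp([1, 1, 2, 3, 3], [1, 2, 3])
```

Carrying the "last step was flat" flag in the state is what enforces the rule that the warped index may stay constant for at most one grid step in a row.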
COMPUTING DISTANCE
The overall distance is compared with an appropriately
chosen threshold
The contour distance measure is
$d_j = \sum_i \left[ \frac{a_{js}(i) - a_{jr}(i)}{\sigma_j(i)} \right]^2$
$a_{js}(i)$ - value of the jth measurement contour at time i
$a_{jr}(i)$ - value of the jth reference contour at time i
$\sigma_j(i)$ - standard deviation of the jth measurement at time i
The overall distance D is
$D = \sum_j w_j d_j$
$w_j$ - jth weight, chosen on the basis of the effectiveness of the jth
measurement in verifying the speaker
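As a rough illustration, the contour distance (the sum over time of the squared difference between measurement and reference, each scaled by the standard deviation) and the weighted overall distance can be written as below. The function names, the list-based data layout and the "accept if D is at or below the threshold" convention are assumptions made for this sketch; the contours are assumed to be time-aligned already.

```python
def contour_distance(measured, reference, sigma):
    """d_j: sum over time i of ((a_js(i) - a_jr(i)) / sigma_j(i)) ** 2."""
    if not (len(measured) == len(reference) == len(sigma)):
        raise ValueError("contours must have the same length after warping")
    return sum(((s - r) / sd) ** 2
               for s, r, sd in zip(measured, reference, sigma))


def overall_distance(measured, reference, sigma, weights):
    """D: weighted sum of the per-contour distances d_j."""
    return sum(w * contour_distance(m, r, sd)
               for w, m, r, sd in zip(weights, measured, reference, sigma))


def verify(measured, reference, sigma, weights, threshold):
    """Accept the claimed identity if D does not exceed the threshold
    (assumed convention)."""
    return overall_distance(measured, reference, sigma, weights) <= threshold


# Two contours (for example pitch and energy) with made-up values.
measured = [[1, 2, 3], [0, 0]]
reference = [[1, 1, 1], [3, 4]]
sigma = [[1, 1, 2], [1, 2]]
weights = [0.5, 2]
D = overall_distance(measured, reference, sigma, weights)
```

With these made-up numbers the two contour distances are 2 and 13, so D = 0.5 * 2 + 2 * 13 = 27.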
SPEAKER IDENTIFICATION
The processing is similar to speaker verification
N distance measurements are made, one per speaker in the population
The final decision for speaker identification is
to choose the speaker whose reference
pattern is closest in distance to the sample
pattern
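The decision step amounts to picking the minimum of N distances. A small sketch, assuming the reference patterns are held in a dictionary keyed by speaker name and using a squared Euclidean distance purely as a stand-in for the real distance measure:

```python
def squared_distance(sample, reference):
    """Placeholder distance (an assumption); any distance measure can be used."""
    return sum((s - r) ** 2 for s, r in zip(sample, reference))


def identify_speaker(sample, references, distance=squared_distance):
    """Make one distance measurement per speaker and return the closest one."""
    if not references:
        raise ValueError("no reference patterns stored")
    return min(references, key=lambda name: distance(sample, references[name]))


# Hypothetical reference patterns for three speakers.
references = {
    "speaker_1": [1.0, 2.0, 3.0],
    "speaker_2": [4.0, 4.0, 4.0],
    "speaker_3": [0.0, 0.0, 9.0],
}
best = identify_speaker([3.8, 4.1, 4.3], references)
```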
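DISTANCE MEASUREMENT BY ATAL
Let $x$ be an $L$-dimensional column vector representing the input pattern
Assumption:
The joint probability density function of the measurements for the $i$th speaker is a
multidimensional Gaussian distribution with mean $m_i$ and covariance $W_i$
$g_i(x) = (2\pi)^{-L/2} |W_i|^{-1/2} \exp\left[ -\frac{1}{2} (x - m_i)^t W_i^{-1} (x - m_i) \right]$
The decision rule for speaker identification that minimizes the probability of error is to choose speaker $i$ if
$p_i g_i(x) \ge p_j g_j(x)$ for all $i \ne j$
where $p_i$ is the prior probability of the $i$th speaker
Equivalently, decide class $i$ if
$d_i(x) = \frac{1}{2} (x - m_i)^t W_i^{-1} (x - m_i) + \frac{1}{2} \ln |W_i| - \ln p_i \le d_j(x)$ for all $i \ne j$
Without the determinant and prior terms and the factor of 1/2, the distance is
$d_i = (x - m_i)^t W_i^{-1} (x - m_i)$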
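The Gaussian decision rule can be sketched in plain Python as follows. Everything named here is illustrative: the helper `solve_and_logdet`, the dictionary layout for the per-speaker mean, covariance and prior, and the example numbers are assumptions, and a small Gaussian elimination is used instead of a linear algebra library to keep the sketch self-contained. The distance computed is half the quadratic form in the inverse covariance, plus half the log-determinant, minus the log prior.

```python
import math


def solve_and_logdet(W, b):
    """Solve W y = b by Gaussian elimination with partial pivoting.

    Returns (y, ln|det W|). Assumes W is square and nonsingular.
    """
    n = len(b)
    A = [[float(v) for v in row] + [float(bi)] for row, bi in zip(W, b)]
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if A[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        logdet += math.log(abs(A[col][col]))
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(A[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (A[r][n] - rest) / A[r][r]
    return y, logdet


def gaussian_distance(x, mean, cov, prior):
    """d_i(x) = 1/2 (x-m)^t W^-1 (x-m) + 1/2 ln|W| - ln p."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y, logdet = solve_and_logdet(cov, diff)  # y = W^-1 (x - m)
    quad = sum(d * yi for d, yi in zip(diff, y))
    return 0.5 * quad + 0.5 * logdet - math.log(prior)


def identify(x, speakers):
    """Choose the speaker i with the smallest d_i(x)."""
    return min(speakers, key=lambda name: gaussian_distance(x, *speakers[name]))


# Hypothetical two-speaker population: name -> (mean, covariance, prior).
speakers = {
    "A": ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.5),
    "B": ([3.0, 3.0], [[1.0, 0.0], [0.0, 1.0]], 0.5),
}
winner = identify([0.5, 0.2], speakers)
```

Minimizing this distance is the same as maximizing the product of prior and density, since the distance is the negative logarithm of that product up to a constant that is shared by all speakers.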