Use of Spectral Autocorrelation in Spectral Envelope Linear Prediction For Speech Recognition
Under the guidance of
Prof. M. B. MANJUNATHA, M.Tech (PhD)
Introduction
• Outline of the project
1. Introduction to Speech
i) Speech Production
ii) Speech Recognition
2. Implementation
3. Spectral Envelope LPC Analysis
4. Speech Recognition Using Dynamic Time Warping (DTW)
5. Result
1. INTRODUCTION TO SPEECH
i) Speech Production:
Voiced Sounds:
• Voiced sounds are produced when the vocal cords vibrate, releasing
quasi-periodic pulses of air that excite the vocal tract.
• Examples are /u/, /d/, /w/, /i/, and /e/.
ii) Speech Recognition:
1. Parameter Estimation
2. Parameter Comparison
3. Decision Making
IMPLEMENTATION
• Input speech signal
• Pre-emphasis
• Hamming windowing
• Spectral autocorrelation
• LPC coefficients
• Dynamic time warping (DTW) against the reference word
• Recognized word
Pre-Emphasis
• The spectrum of voiced segments has more energy at lower frequencies
than at higher frequencies.
– This is called spectral tilt.
– Spectral tilt is caused by the nature of the glottal pulse.
– Pre-emphasis boosts the higher frequencies to compensate for the tilt.
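A common way to compensate for spectral tilt is a first-order high-pass filter, y(m) = s(m) - alpha*s(m-1). The sketch below is only an illustration: the coefficient alpha = 0.95, the sampling rate, and the synthetic two-tone signal are assumptions, not values taken from these slides.

```matlab
% Pre-emphasis sketch: y(m) = s(m) - alpha*s(m-1).
% alpha, fs and the test signal are assumed values for illustration only.
alpha = 0.95;                       % assumed pre-emphasis coefficient
fs = 8000;                          % assumed sampling rate in Hz
t = (0:fs/10-1)' / fs;              % 100 ms of sample times
s = sin(2*pi*200*t) + 0.1*sin(2*pi*3000*t);   % strong low tone, weak high tone

y = filter([1 -alpha], 1, s);       % first-order FIR high-pass filter

% Magnitude response of the filter at a frequency f given in Hz.
gain = @(f) abs(1 - alpha*exp(-1j*2*pi*f/fs));
gainLow = gain(200);                % attenuation applied to the low tone
gainHigh = gain(3000);              % boost applied to the high tone
```

Comparing `gainLow` and `gainHigh` shows how strongly the filter favors the high-frequency component for the chosen alpha.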
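Hamming Windowing
    h = hamming(n);        % n-point Hamming window (column vector)
    M2 = diag(h) * M;      % scale row i of M by h(i): windows every frame (column) of M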
Common window shapes
Rectangular window
Hamming window
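The two window shapes can be written out explicitly, which avoids relying on a toolbox function. This is a minimal sketch: the frame length, the number of frames, and the all-ones frame matrix `M` are placeholders chosen for illustration.

```matlab
% Window sketch: rectangular and symmetric Hamming windows built by hand.
n = 256;                                   % assumed frame length in samples
k = (0:n-1)';
wRect = ones(n, 1);                        % rectangular window: all samples weighted equally
wHamm = 0.54 - 0.46*cos(2*pi*k/(n-1));     % Hamming window: tapers toward both frame edges

% M is a hypothetical n-by-nbFrame matrix holding one frame per column.
nbFrame = 4;
M = ones(n, nbFrame);
M2 = diag(wHamm) * M;                      % multiply every frame by the window
```

With an all-ones `M`, each column of `M2` is simply the window itself, which makes the taper easy to inspect.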
Linear prediction
• Linear Predictive Coding (LPC) provides
– a low-dimensional representation of the speech signal within one
frame
– representation of spectral envelope, not harmonics
– “analytically tractable” method
– some ability to identify formants
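The prediction error over frame n is

\[ E_n = \sum_{m=M_1}^{M_2} \Big[ s_n(m) - \sum_{k=1}^{p} a_k\, s_n(m-k) \Big]^2 \]

and the coefficients that minimize it satisfy

\[ \sum_{k=1}^{p} \hat{a}_k \sum_{m=M_1}^{M_2} s_n(m-i)\, s_n(m-k) = \sum_{m=M_1}^{M_2} s_n(m-i)\, s_n(m), \qquad 1 \le i \le p \]

Derivation (dropping the subscript n on s):

\[ E_n = \sum_{m=M_1}^{M_2} \Big[ s(m) - \sum_{k=1}^{p} a_k\, s(m-k) \Big]^2 \]

\[ E_n = \sum_{m=M_1}^{M_2} \Big[ s^2(m) - 2 s(m) \sum_{k=1}^{p} a_k\, s(m-k) + \sum_{k=1}^{p} a_k\, s(m-k) \sum_{r=1}^{p} a_r\, s(m-r) \Big] \]

\[
\begin{aligned}
E_n = \sum_{m=M_1}^{M_2} \Big[\, & s^2(m) - 2 s(m)\, a_1 s(m-1) + a_1 s(m-1) \sum_{r=1}^{p} a_r\, s(m-r) \\
& - 2 s(m)\, a_2 s(m-2) + a_2 s(m-2) \sum_{r=1}^{p} a_r\, s(m-r) \\
& - \cdots - 2 s(m)\, a_p s(m-p) + a_p s(m-p) \sum_{r=1}^{p} a_r\, s(m-r) \Big]
\end{aligned}
\]

\[ \frac{\partial E_n}{\partial a_1} = \sum_{m=M_1}^{M_2} \Big[ -2 s(m)\, s(m-1) + 2 \sum_{r=1}^{p} a_r\, s(m-1)\, s(m-r) \Big] = 0 \]

Repeating this for every a_i gives

\[ \sum_{k=1}^{p} a_k \sum_{m=M_1}^{M_2} s(m-i)\, s(m-k) = \sum_{m=M_1}^{M_2} s(m-i)\, s(m), \qquad 1 \le i \le p \]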
LPC Autocorrelation Method
Autocorrelation: measure of periodicity in signal
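\[ \phi_n(i,k) = \sum_{m=M_1}^{M_2} s(m-i)\, s(m-k) \]

\[ \sum_{k=1}^{p} \hat{a}_k\, \phi_n(i,k) = \phi_n(i,0), \qquad 1 \le i \le p \]

With the windowed frame \hat{s}_n(m) of N samples,

\[ R_n(k) = \sum_{m=0}^{N-1-k} \hat{s}_n(m)\, \hat{s}_n(m+k) \]

\[ \phi_n(i,k) = R_n(|i-k|) \]

so the set of equations for a_k (eqn (7)) can be rewritten by combining (7) and
(12):

\[ \sum_{k=1}^{p} \hat{a}_k\, R_n(|i-k|) = R_n(i), \qquad 1 \le i \le p \]

In matrix form, equation (14) looks like this:

\[
\begin{bmatrix}
R_n(0) & R_n(1) & \cdots & R_n(p-1) \\
R_n(1) & R_n(0) & \cdots & R_n(p-2) \\
\vdots & \vdots & \ddots & \vdots \\
R_n(p-1) & R_n(p-2) & \cdots & R_n(0)
\end{bmatrix}
\begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \vdots \\ \hat{a}_p \end{bmatrix}
=
\begin{bmatrix} R_n(1) \\ R_n(2) \\ \vdots \\ R_n(p) \end{bmatrix}
\]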
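Because the matrix is Toeplitz, the equations can be solved recursively (Levinson-Durbin recursion):

\[ E^{(0)} = R(0) \]

\[ k_i = \frac{R(i) - \sum_{j=1}^{i-1} \alpha_j^{(i-1)} R(i-j)}{E^{(i-1)}}, \qquad 1 \le i \le p \]

\[ \alpha_i^{(i)} = k_i \]

\[ \alpha_j^{(i)} = \alpha_j^{(i-1)} - k_i\, \alpha_{i-j}^{(i-1)}, \qquad 1 \le j \le i-1 \]

\[ E^{(i)} = (1 - k_i^2)\, E^{(i-1)} \]

\[ \hat{a}_j = \alpha_j^{(p)}, \qquad 1 \le j \le p \]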
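The recursion can be sketched in a few lines and cross-checked against a direct solve of the same Toeplitz system. The short frame `s` and the order `p = 3` are arbitrary illustrative choices, not data from this project; because MATLAB indices start at 1, `R(k+1)` stores the autocorrelation at lag k.

```matlab
% Levinson-Durbin sketch for the autocorrelation normal equations.
s = [1 2 3 2 1 0 -1 -2 -1 0]';     % arbitrary illustrative frame (assumption)
p = 3;                             % assumed prediction order
N = length(s);

R = zeros(p+1, 1);                 % R(k+1) holds the autocorrelation at lag k
for k = 0:p
    R(k+1) = s(1:N-k)' * s(1+k:N);
end

E = R(1);                          % E^(0) = R(0)
a = zeros(p, 1);                   % holds alpha_j^(i) for the current order i
for i = 1:p
    acc = R(i+1);                  % start from R(i)
    for j = 1:i-1
        acc = acc - a(j) * R(i-j+1);       % subtract alpha_j^(i-1) * R(i-j)
    end
    ki = acc / E;                  % reflection coefficient k_i
    aPrev = a;                     % keep the order-(i-1) coefficients
    a(i) = ki;                     % alpha_i^(i) = k_i
    for j = 1:i-1
        a(j) = aPrev(j) - ki * aPrev(i-j); % update the lower-order coefficients
    end
    E = (1 - ki^2) * E;            % E^(i) = (1 - k_i^2) * E^(i-1)
end

% Cross-check: solve the same Toeplitz system directly.
aDirect = toeplitz(R(1:p)) \ R(2:p+1);
```

Both `a` and `aDirect` solve the same system, so they should agree to within rounding error; the recursion additionally yields the residual energy `E` at no extra cost.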
We can compute the spectral envelope magnitude from the LPC
parameters
by evaluating the transfer function S(z) at z = e^{j\omega}:

\[ S(e^{j\omega}) = \frac{G}{A(e^{j\omega})} = \frac{G}{1 - \sum_{k=1}^{p} a_k\, e^{-j\omega k}} \]
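The transfer function can be evaluated directly on a frequency grid without any toolbox call. In this sketch the two predictor coefficients, the gain G = 1, and the grid size are hypothetical values picked only to give a stable, single-peak example.

```matlab
% Envelope sketch: evaluate S(e^{jw}) = G / (1 - sum_k a_k e^{-jwk}) on a grid.
a = [1.3; -0.6];                   % hypothetical predictor coefficients a_1, a_2
G = 1;                             % assumed gain
nFreq = 256;                       % number of frequency points in [0, pi)
w = (0:nFreq-1)' * pi / nFreq;     % angular frequencies (column vector)
k = 1:length(a);                   % predictor lags 1..p (row vector)

A = 1 - exp(-1j * w * k) * a;      % A(e^{jw}) at every grid frequency (nFreq-by-1)
S = G ./ A;                        % transfer function values
envDb = 20*log10(abs(S));          % spectral envelope in dB

[peakDb, peakIdx] = max(envDb);    % location of the single resonance
peakFreq = w(peakIdx);             % peak position in radians per sample
```

To the best of my knowledge this corresponds to `freqz(G, [1; -a], nFreq)` in the Signal Processing Toolbox, which evaluates the same response over the same grid.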
Finding frequency envelope using LPC method
    for col = 1:nbFrame
        % compute the autocorrelation function for lags 0..Or (Or = LPC order):
        rx = zeros(Or+1, 1);
        % the small offset presumably keeps silent frames from giving a singular matrix
        speech1 = M2(:,col)' + 0.000001;
        for i = 1:Or+1
            rx(i) = speech1(1:n-i+1) * speech1(i:n)';
        end
        % build the Or-by-Or Toeplitz autocorrelation matrix:
        covmatrix = zeros(Or, Or);
        for i = 1:Or
            covmatrix(i,i:Or) = rx(1:Or-i+1)';
            covmatrix(i:Or,i) = rx(1:Or-i+1);
        end
        % solve the "normal equations" for the prediction coefficients:
        Acoeffs = - covmatrix \ rx(2:Or+1);
        Alp = [1, Acoeffs'];   % LP polynomial A(z)
        % envelope in dB: log10 is required here (log is the natural logarithm)
        dbenvlp(:,col) = 20*log10(abs(freqz(1, Alp, n*2)'));
    end
Dynamic Time Warping (DTW)
• The SELP analysis is evaluated using dynamic time warping.
• T = {t1, t2, …, ti, …, tn}, R = {r1, r2, …, rj, …, rm}
• An m x n matrix is created, holding the distance between each element of R and each element of T.
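The matrix construction and the accumulation step can be sketched as follows. This is a minimal illustration under stated assumptions: the sequences are scalar toy data (a recognizer would compare LPC feature vectors frame by frame), the local distance is the absolute difference, and the step pattern allows only the three basic moves; none of these choices is taken from the slides.

```matlab
% DTW sketch for two 1-D sequences.
T = [1 2 3 4 3 2 1];               % hypothetical test sequence (length n)
R = [1 1 2 3 4 4 3 2 1];           % hypothetical reference sequence (length m)
n = length(T);
m = length(R);

% Local distance matrix (m-by-n): d(i,j) = |r_i - t_j|.
d = abs(R(:) * ones(1, n) - ones(m, 1) * T(:)');

% Accumulated cost: each cell adds its local distance to the cheapest
% of its three predecessors (above, left, diagonal).
D = inf(m, n);
D(1, 1) = d(1, 1);
for i = 1:m
    for j = 1:n
        if i == 1 && j == 1
            continue;              % start cell is already initialized
        end
        best = inf;
        if i > 1,          best = min(best, D(i-1, j));   end
        if j > 1,          best = min(best, D(i, j-1));   end
        if i > 1 && j > 1, best = min(best, D(i-1, j-1)); end
        D(i, j) = d(i, j) + best;
    end
end
dtwDistance = D(m, n);             % total cost of the best alignment
```

Here R is a time-stretched copy of T, so the warping path aligns the two exactly and the accumulated distance is zero. In a recognizer, the reference word with the smallest `dtwDistance` would be reported as the recognized word.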