Abstract— In speech enhancement, the term "quality of speech" is associated with clarity and intelligibility. Because the nature and characteristics of noise vary with time and from process to process, speech enhancement in noisy environments remains a difficult problem. In this paper, we propose a method to improve the quality of speech that combines digital audio effects with an improved adaptive Kalman filter, for the case where only corrupted speech is available. Digital audio effects are used to enhance the speech content of the noisy speech signal: a digital expander generates an audio effect that operates at low signal levels and creates a more lively sound characteristic. The noise is then removed by an improved adaptive Kalman filter based on an autoregressive (AR) model. With additive colored noise, the proposed method is found to perform better than spectral subtraction, Wiener filtering, and Kalman filter methods in terms of signal-to-noise ratio (SNR) and intelligibility.

Keywords— Kalman filter; intelligibility; digital audio effect; digital expander; Wiener filter; spectral subtraction.

I. INTRODUCTION

The primary objective of many speech enhancement algorithms is to improve the perceptual quality of the speech signal extracted from noisy speech. Noise estimation is the major component of speech enhancement techniques, because better noise estimation yields higher-quality speech extraction. Removing noise from noisy speech remains a challenging issue because the spectral properties of non-stationary noise are very difficult to estimate and predict. Noise estimation also requires care: if the noise power exceeds the speech power, speech content may be treated as noise and removed. Because speech processing is widely used in applications such as teleconferencing systems, speech-recognition-based security devices, biomedical signal processing, hearing aids, ATMs, and computers, speech enhancement is an active research area in signal processing, and it remains challenging because in most cases only the noisy speech is available [1]. Over the past years, researchers have developed many efficient algorithms to improve noisy speech, yet the problem still poses a challenge because the characteristics of the noise signal vary dramatically over time and from application to application. Over the last ten years, researchers have proposed many filtering-based speech enhancement techniques, such as spectral subtraction, Wiener filtering, and Kalman filtering.

Spectral subtraction is used to enhance speech degraded by additive stationary background noise, but it suffers from musical noise and does not remove noise during silence periods [2]. In Wiener-filter-based speech enhancement, the original speech signal is recovered by minimizing the mean square error (MSE) between the clean speech and the estimated signal [3]. Spectral subtraction and Wiener-filter-based speech enhancement algorithms require the characteristics of the clean speech, but in real-time operation clean speech may not be available in all cases.

A survey of the literature shows that several other techniques have been proposed to enhance speech. In [4], speech is recovered from the noisy speech signal using the harmonic structure of speech; in [5], a sinusoidal model is adopted; and in [6], Ephraim introduced an MMSE estimator for speech enhancement. The use of the Kalman filter for speech enhancement was first proposed by K. K. Paliwal and A. Basu [7], who estimated the speech signal parameters from the clean speech before it was corrupted by white noise. The approach was later extended to random and colored noise [8]. In these methods, a tradeoff must be maintained between SNR and intelligibility.

Later, many modifications were made to the Kalman filter to improve it further, but they do not meet expectations and their complexity is higher. In this paper, a new adaptive Kalman-filter-based method with lower complexity and better performance, combined with a nonlinear digital filter called a digital expander, is proposed to recover the speech signal from noisy speech. The additive noise is modeled as an AR process based on linear prediction coefficient (LPC) estimation within the Kalman filtering algorithm [7]. In addition to coefficient estimation, this paper addresses the problem of de-noising random and colored noise. We assume that the colored noise is also an autoregressive process [8], so we estimate its AR coefficients and variances by linear prediction in the same way.

In this paper, to overcome the problems stated above, a new adaptive Kalman-filter-based method, with preprocessing by a digital audio effect called a digital expander, is proposed to recover the speech signal from a sequence (frame) of noisy speech signals, with the additive noise modeled as an AR process [9]. The estimation of the time-varying autoregressive (AR) speech model parameters is based on linear prediction coefficient (LPC) estimation. In addition to coefficient estimation, this paper addresses the problem of de-noising colored noise. We assume that the noise is also an autoregressive process [10], so we estimate its AR coefficients and variances by LPC in the same way.

The paper is organized as follows. Section II gives the theoretical and mathematical description of the proposed method. Section III deals with the implementation and evaluation of the proposed method. Simulation results are placed at the end.

Digital Audio Effects:
Digital audio effects can be classified as basic filtering, time-varying filters, delays, modulators, non-linear processing, and spatial effects. Non-linear digital filters are characterized by creating, intentionally or unintentionally, harmonic and inharmonic frequency components that are not present in the original signal. In dynamics processing, the signal envelope is controlled with compressors or limiters so as to minimize harmonic distortion.

The digital expander is a dynamic range controller that minimizes distortion in the speech. An expander operates at low signal levels to boost the dynamics of the signal, which is useful for creating a more lively sound characteristic [9].

The signal x(n) is determined from the input with variable attack and release times. The logarithm of x(n) is compared with the threshold value. If the signal is above the threshold, the difference is multiplied by the negative slope of the limiter, LS, and the antilogarithm of the result is taken. The control factor f(n) obtained is then smoothed with a first-order low-pass filter. If the signal x(n) lies below the threshold level, the control factor is set to f(n) = 1. The delayed input x(n − 1) is multiplied by the smoothed control factor to give the output.

Figure (a) shows a block diagram of the digital expander. The logarithm of the signal is taken and multiplied by 0.5. The value obtained is compared with two thresholds in order to determine the operating range of the static curve. If a threshold is crossed, the resulting difference is multiplied by the corresponding slope and the antilogarithm of the result is taken. A first-order low-pass filter subsequently provides the attack and release times.

Figure (a): Block diagram of the digital expander

Kalman Filtering:
Kalman filtering is an effective speech enhancement technique in which the speech signal is usually modeled as an autoregressive (AR) process and represented in the state-space domain. A Kalman filter is an estimation and updating process.

In this process, the speech signal $s(n)$ and the additive noise signal $v(n)$ are expressed as AR models of order $p$ and $q$, respectively, as follows:

$$ s(n) = \sum_{i=1}^{p} a_i\, s(n-i) + w(n) \tag{1} $$

$$ v(n) = \sum_{j=1}^{q} b_j\, v(n-j) + u(n) \tag{2} $$

The noisy speech can be expressed as

$$ y(n) = s(n) + v(n) \tag{3} $$

where $s(n)$ is the $n$th sample of the speech signal, $v(n)$ is the $n$th sample of the additive noise, $y(n)$ is the $n$th sample of the noisy speech, $a_i$ and $b_j$ are the AR model parameters (held fixed within a frame), and $w(n)$ and $u(n)$ are the white excitation sequences of the two models.

The AR-modeled speech signal can be expressed in state-space form as shown below, with state vector $\mathbf{s}(n) = [s(n), s(n-1), \ldots, s(n-p+1)]^T$:

$$ \mathbf{s}(n+1) = F_s\, \mathbf{s}(n) + \big(w(n+1), 0, \ldots, 0\big)^T \tag{4} $$

$$ F_s = \begin{bmatrix} a_1 & a_2 & \cdots & a_{p-1} & a_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \tag{5} $$

Similarly, for the noise, with state vector $\mathbf{v}(n) = [v(n), v(n-1), \ldots, v(n-q+1)]^T$:

$$ \mathbf{v}(n+1) = F_v\, \mathbf{v}(n) + \big(u(n+1), 0, \ldots, 0\big)^T \tag{6} $$

$$ F_v = \begin{bmatrix} b_1 & \cdots & b_{q-1} & b_q \\ 1 & \cdots & 0 & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix} \tag{7} $$

$$ X(n) = \begin{bmatrix} \mathbf{s}(n) \\ \mathbf{v}(n) \end{bmatrix}, \qquad W(n) = \begin{bmatrix} \big(w(n), 0, \ldots, 0\big)^T \\ \big(u(n), 0, \ldots, 0\big)^T \end{bmatrix} \tag{8} $$

From Eqs. (3), (4), and (6), with $F = \mathrm{diag}(F_s, F_v)$,

$$ X(n+1) = F\, X(n) + W(n+1) \tag{9} $$

$$ y(n) = H\, X(n) $$

where $H$ is a row vector with ones in positions $1$ and $p+1$ and zeros elsewhere, so that it selects $s(n)$ and $v(n)$ from the state.
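The equivalence between the AR recursion of Eq. (1) and the state-space form of Eqs. (4) and (5) can be checked numerically. The following is a minimal Python sketch under stated assumptions, not the implementation used in this paper: the function names (`companion`, `ar_direct`, `ar_state_space`), the AR(2) coefficients, and the zero initial conditions are all illustrative choices, and the AR coefficients are assumed to be fixed within the frame.

```python
import random


def companion(coeffs):
    """Companion matrix of an AR model: AR coefficients in the first row,
    a shifted identity below (the structure of Eqs. (5) and (7))."""
    k = len(coeffs)
    rows = [list(coeffs)]
    for i in range(k - 1):
        rows.append([1.0 if j == i else 0.0 for j in range(k)])
    return rows


def ar_direct(coeffs, excitation):
    """Direct AR recursion, Eq. (1), with zero initial conditions."""
    p = len(coeffs)
    s = []
    for n, w in enumerate(excitation):
        past = sum(coeffs[i] * s[n - 1 - i] for i in range(p) if n - 1 - i >= 0)
        s.append(past + w)
    return s


def ar_state_space(coeffs, excitation):
    """Same signal generated through the state-space recursion, Eq. (4)."""
    p = len(coeffs)
    F = companion(coeffs)
    state = [0.0] * p  # [s(n), s(n-1), ..., s(n-p+1)], initially zero
    s = []
    for w in excitation:
        # state(n+1) = F * state(n) + (w(n+1), 0, ..., 0)^T
        state = [sum(F[i][j] * state[j] for j in range(p)) for i in range(p)]
        state[0] += w
        s.append(state[0])  # first state component is the current sample
    return s


# Illustrative (assumed) stable AR(2) model driven by white Gaussian noise.
a = [1.5, -0.7]
rng = random.Random(1)
w = [rng.gauss(0.0, 1.0) for _ in range(200)]

F = companion(a)
direct = ar_direct(a, w)
via_state = ar_state_space(a, w)
max_diff = max(abs(u - v) for u, v in zip(direct, via_state))
```

Both routines produce the same sequence up to rounding, which is the property the Kalman filter relies on when it tracks the speech through the state vector rather than through the scalar recursion.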
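The state estimate and its error covariance are first predicted from the previous estimates:

$$ \hat{X}(n|n-1) = F\, \hat{X}(n-1|n-1) \tag{10} $$

$$ P(n|n-1) = F\, P(n-1|n-1)\, F^T + Q \tag{11} $$

Updating: compute the Kalman gain, update the state vector, and update the error covariance matrix. The quantities in the above equations are updated in every time frame using the following discrete Kalman filter update:

$$ K(n) = P(n|n-1)\, H^T \big( H\, P(n|n-1)\, H^T + R \big)^{-1} \tag{12} $$

$$ \hat{X}(n|n) = \hat{X}(n|n-1) + K(n) \big( y(n) - H\, \hat{X}(n|n-1) \big) \tag{13} $$

$$ P(n|n) = \big( I - K(n)\, H \big)\, P(n|n-1) \tag{14} $$

Here $F$ is the state transition matrix, $H$ is the observation vector, $y(n)$ is the noisy speech sample, $Q$ is the covariance of the driving noise, $R$ is the measurement noise variance, $K(n)$ is the Kalman gain, $P$ is the error covariance matrix, and $I$ is the identity matrix. These parameters are updated at each iteration.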
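The prediction and update steps of Eqs. (10) to (14) can be sketched for the combined speech-plus-colored-noise state as follows. This is a hedged illustration rather than the adaptive filter proposed in this paper: the AR coefficients and excitation variances are assumed known (in practice they would be estimated per frame by LPC), the digital expander stage is omitted, and all names (`kalman_enhance`, `simulate_ar`, `var_w`, `var_u`, `r`) and numerical values are assumptions made for the example. Because the observation is scalar, the matrix inverse in Eq. (12) reduces to a division.

```python
import random


def companion(coeffs):
    """Companion matrix of an AR model (structure of Eqs. (5) and (7))."""
    k = len(coeffs)
    return [list(coeffs)] + [[1.0 if j == i else 0.0 for j in range(k)]
                             for i in range(k - 1)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kalman_enhance(y, a, b, var_w, var_u, r=0.0):
    """Estimate s(n) from y(n) = s(n) + v(n), with AR speech and AR noise."""
    p, q = len(a), len(b)
    dim = p + q
    Fs, Fv = companion(a), companion(b)
    F = [[0.0] * dim for _ in range(dim)]  # F = diag(F_s, F_v)
    for i in range(p):
        F[i][:p] = Fs[i]
    for i in range(q):
        F[p + i][p:] = Fv[i]
    Ft = [list(col) for col in zip(*F)]
    H = [0.0] * dim
    H[0] = H[p] = 1.0                      # picks s(n) and v(n)
    Q = [[0.0] * dim for _ in range(dim)]
    Q[0][0], Q[p][p] = var_w, var_u        # excitation enters first rows only
    x = [0.0] * dim
    P = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    s_hat = []
    for yn in y:
        # Prediction, Eqs. (10) and (11)
        x_pred = [sum(F[i][j] * x[j] for j in range(dim)) for i in range(dim)]
        P_pred = matmul(matmul(F, P), Ft)
        for i in range(dim):
            for j in range(dim):
                P_pred[i][j] += Q[i][j]
        # Kalman gain, Eq. (12); innovation variance is a scalar
        PHt = [sum(P_pred[i][j] * H[j] for j in range(dim)) for i in range(dim)]
        S = sum(H[i] * PHt[i] for i in range(dim)) + r
        K = [val / S for val in PHt]
        # State update, Eq. (13)
        innov = yn - sum(H[i] * x_pred[i] for i in range(dim))
        x = [x_pred[i] + K[i] * innov for i in range(dim)]
        # Covariance update, Eq. (14): P = (I - K H) P_pred
        HP = [sum(H[k] * P_pred[k][j] for k in range(dim)) for j in range(dim)]
        P = [[P_pred[i][j] - K[i] * HP[j] for j in range(dim)]
             for i in range(dim)]
        s_hat.append(x[0])                 # first state component is s(n)
    return s_hat


def simulate_ar(coeffs, sigma, n, rng):
    """Generate n samples of an AR process with white Gaussian excitation."""
    hist, out = [0.0] * len(coeffs), []
    for _ in range(n):
        val = sum(c * h for c, h in zip(coeffs, hist)) + rng.gauss(0.0, sigma)
        out.append(val)
        hist = [val] + hist[:-1]
    return out


def mse(u, v):
    return sum((p_ - q_) ** 2 for p_, q_ in zip(u, v)) / len(u)


# Synthetic stand-ins for speech (AR(2)) and colored noise (AR(1)).
rng = random.Random(0)
a, b = [1.5, -0.7], [-0.8]
clean = simulate_ar(a, 1.0, 2000, rng)
noise = simulate_ar(b, 1.0, 2000, rng)
noisy = [s + v for s, v in zip(clean, noise)]
enhanced = kalman_enhance(noisy, a, b, var_w=1.0, var_u=1.0)
mse_noisy, mse_enhanced = mse(noisy, clean), mse(enhanced, clean)
```

In this sketch the colored noise is part of the state, so the measurement noise variance `r` defaults to zero. Comparing `mse_enhanced` with `mse_noisy` gives a quick check that the filter reduces the error relative to the unprocessed noisy signal on this synthetic example.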
Figure (6): Proposed method output (Kalman with digital expander) and clean speech waveforms

Figure (7): Spectrogram of clean and noisy speech signals, Kalman filter method, and proposed method

REFERENCES
[1] J. Benesty, J. Chen, and S. Makino, Eds., "Speech Enhancement," Springer, Berlin, 2005.
[2] S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, 1979.
[3] T. V. Sreenivas and P. Kirnapure, "Codebook constrained Wiener filtering for speech enhancement," IEEE Trans. Speech and Audio Processing, Sep.
[4] D. V. Anderson and M. A. Clements, "Audio signal noise reduction using harmonic modeling," in Proc. IEEE Int. Conf. Acoust. ICASSP, 1999.
[5] J. Jensen and J. H. L. Hansen, "Speech enhancement using a constrained iterative sinusoidal model," IEEE Trans. Speech Audio Process., vol. 9
[6] Y. Ephraim, "A minimum mean square error approach for speech enhancement," in Proc. IEEE Conf., 1990.
[7] K. K. Paliwal and A. Basu, "A speech enhancement method based on Kalman filtering," Proceedings of ICASSP'87, pp. 177-180, Dallas, TX,
[8] B. Chabane and B. Daoued, "On the use of Kalman filter for enhancing speech corrupted by colored noise," WSEAS Trans. on Signal Processing, Dec. 2008.
[9] U. Zölzer, "Digital Audio Signal Processing," second edition.
[10] "Noizeus- A noisy speech corpus for evaluation of speech enhancement ... enhancement," IEEE Trans. on Speech and Audio Processing, 2008.