Multi_stage_Collaborative_Microphone_Arr
noise $v_m[n]$. The DSB spatially aligns the microphone signals with reference to the desired source direction, yielding the speech reference signal $d[n]$:

$$d[n] = \sum_{m=0}^{M-1} \sum_{l=0}^{L-1} a_m[l]\, s[n-l] + v_m[n] \qquad (1)$$

where we suppose that each AIR between the desired source and the m-th microphone has the same length, denoted by L.

Figure 1: Microphone array beamforming architecture. (Block labels: delays τ1, τ2, ..., τM; Σ; Delay; Blocking Matrix; Collaborative Adaptive Noise Canceller; signals d[n], x_k[n], z[n], e[n].)

In the adaptive path of the beamformer, the blocking matrix (BM) generates the noise references $x_k[n]$, with $k = 0, \ldots, K-1$ and $K = M-1$. The blocking matrix is implemented by pairwise differences between microphone signals [9]. The noise reference signals are then processed by the collaborative ANC, whose structure is described in the next section. The task of the collaborative ANC is to remove the residual noise components in the speech reference signal, minimizing the output power and yielding the beamformer output signal $e[n]$.

3. Collaborative Adaptive Noise Canceller

The distinguishing feature of the proposed beamforming technique is the structure of the collaborative ANC. Generally, an ANC is composed of an adaptive filter bank forming a MISO system. The adopted architecture, depicted in Fig. 2, is a multi-stage convex combination of adaptive filters. In particular, the structure is composed of four different MISO systems, each bringing different filtering capabilities to the whole beamformer. Each MISO system receives the same input signals, which are the noise reference signals coming from the BM. The j-th MISO system can represent the input signals in an $L \times P^{(j)}$ reference noise matrix $X_{n,k}^{(j)}$:

$$X_{n,k}^{(j)} = \begin{bmatrix} x_{n,k} & x_{n-1,k} & \cdots & x_{n-P^{(j)}+1,k} \end{bmatrix} = \begin{bmatrix} x_k[n] & \cdots & x_k[n-P^{(j)}+1] \\ x_k[n-1] & \cdots & x_k[n-P^{(j)}] \\ \vdots & \ddots & \vdots \\ x_k[n-L+1] & \cdots & x_k[n-P^{(j)}-L+2] \end{bmatrix} \qquad (2)$$

where $P^{(j)}$ represents the projection order for all the filters of the j-th MISO system. We denote with $w_{n,k}^{(j)} = \left[ w_k^{(j)}[n], w_k^{(j)}[n-1], \ldots, w_k^{(j)}[n-L+1] \right]^T$ the $L \times 1$ coefficient vector of the k-th filter belonging to the j-th MISO system, with $j = 1, \ldots, 4$, at the n-th time instant. Each filter of the ANC is adapted according to the fast affine projection (FAP) algorithm [6], which is summarized in Table 1, omitting the MISO system index for readability.

FAP Algorithm

1. Initialization: $R_{0,k} = \delta I_P$, $\hat{\varepsilon}_{0,k} = 0$, $\hat{E}_{0,k} = 0$
2. $R_{n,k} = R_{n-1,k} + X_{n,k}^{\lceil P \rceil T} X_{n,k}^{\lceil P \rceil} - X_{n-1,k}^{\lfloor P \rfloor T} X_{n-1,k}^{\lfloor P \rfloor}$
3. $r_{n,k} = \left[ R_{n,k}[n-1], \ldots, R_{n,k}[n-P+1] \right]$
4. $y_k[n] = w_{n,k}\, x_{n,k} + r_{n,k}\, \hat{\varepsilon}_{n,k}$
5. $e_k[n] = d[n] - y_k[n]$
6. $E_{n,k} = \begin{bmatrix} \mu e_k[n] \\ (1-\mu)\, \hat{E}_{n-1,k} \end{bmatrix}$
7. $\hat{E}_{n,k} = \left[ E_k[n], \ldots, E_k[n-P] \right]^T$
8. $g_{n,k} = R_{n,k}^{-1} E_{n,k}$
9. $\varepsilon_{n,k} = \begin{bmatrix} 0 \\ \hat{\varepsilon}_{n-1,k} \end{bmatrix} + g_{n,k}$
10. $\hat{\varepsilon}_{n,k} = \left[ \hat{\varepsilon}_k[n], \ldots, \hat{\varepsilon}_k[n-P] \right]^T$
11. $w_{n,k} = w_{n-1,k} + \varepsilon_k[n-P+1]\, x_{n-P+1,k}$

Table 1: Summary of FAP algorithm.

It is well known that combining filters from different families of algorithms can improve the tracking capabilities of the whole system [7]. In particular, important results can be achieved by combining a family of gradient-based algorithms and a family of Hessian-based algorithms [7]. Taking this into account, a first distinction between the four MISO systems can be made by choosing different values for the projection order. In fact, for $P^{(j)} = 1$ the FAP algorithm turns into the NLMS algorithm, yielding gradient-based properties, while for $P^{(j)} > 1$ the FAP algorithm preserves its Hessian nature. Therefore, we set $P^{(j)} = P_1 = 1$ for $j = 1, 2$, and $P^{(j)} = P_2 > 1$ for $j = 3, 4$. This choice will affect the second-stage convex combination. The second-stage combination is a system-by-system combination scheme. On the other hand, the convex combination of the first stage involves the MISO systems having the same projection order. In particular, the first stage involves two different convex combinations, one for systems $j = 1, 2$ and another one for systems $j = 3, 4$. In this case we differentiate the systems according to the step size value $\mu^{(j)}$: we choose a small step size $\mu^{(j)} = \mu_1$ for $j = 1, 3$ and a large step size $\mu^{(j)} = \mu_2$ for $j = 2, 4$. In this way we further improve the mean-square performance of the adaptive filtering [8]. The combination scheme performed in the first stage is the filter-by-filter scheme.

Let $i = 1, 2$ be the index which refers to the convex combination of the first stage. As shown in Fig. 2, considering the i-th combination, the k-th filter output of the first MISO system is convexly combined with the corresponding k-th filter output of the second MISO system, yielding K outputs, denoted as $z_k^{(i)}[n]$, each related to a noise reference:

$$z_k^{(i)}[n] = \lambda_k^{(i)}[n]\, y_k^{(j)}[n] + \left( 1 - \lambda_k^{(i)}[n] \right) y_k^{(j+1)}[n] \qquad (3)$$

where, in this case, the system index is $j = 1$ when $i = 1$,
and $j = 3$ when $i = 2$. In (3), $\lambda_k^{(i)}[n]$ represents the k-th mixing parameter of the i-th combination of the first stage, and it is updated using a gradient descent rule through the adaptation of an auxiliary parameter, $a_k^{(i)}[n]$, related to $\lambda_k^{(i)}[n]$ by the expression $\lambda_k^{(i)}[n] = \mathrm{sgm}\left( a_k^{(i)}[n] \right)$, according to [8]:

$$a_k^{(i)}[n+1] = a_k^{(i)}[n] + \frac{\mu_a}{q_k^{(i)}[n]}\, e_k^{(j)}[n+1]\, \Delta e_k^{(i)}[n+1]\, \lambda_k^{(i)}[n] \left( 1 - \lambda_k^{(i)}[n] \right) \qquad (4)$$

where $\Delta e_k^{(i)}[n+1] = e_k^{(j+1)}[n+1] - e_k^{(j)}[n+1]$, $\mu_a$ is a common step size value for the adaptation of each auxiliary parameter, $q_k^{(i)}[n] = \beta q_k^{(i)}[n-1] + (1-\beta) \left( \Delta e_k^{(i)}[n+1] \right)^2$ is the estimated power of $\Delta e_k^{(i)}[n+1]$, and $\beta$ is a smoothing factor.

Figure 2: Multi-stage collaborative adaptive noise canceller. (Panel titles: NLMS MISO Systems; FAP MISO Systems.)

In the second stage a system-by-system convex combination is carried out between the two outputs yielded by the first stage. The second-stage output signal, denoted by $z[n]$, represents the overall ANC output:

[...] (5)

where $\eta[n]$ is the mixing parameter of the second stage, adapted using an auxiliary parameter, similarly to (4).

Once the second-stage convex combination has been computed, the overall beamformer output signal $e[n]$ can be derived:

$$e[n] = d[n] - z[n]. \qquad (6)$$

The multi-stage collaborative architecture presented above improves the tracking capabilities of the ANC [7], giving robustness to the overall beamforming system in the presence of nonstationary interfering signals.

4. Simulation Results

In this section we carry out two sets of experiments: the first assesses the effectiveness of the multi-stage collaborative filtering adopted in the proposed beamforming architecture; the second evaluates the proposed beamforming architecture for speech enhancement in multisource environments. Both sets of experiments take place in a 10 × 6.6 × 3 m room with a reverberation time of $T_{60} = 150$ ms.

4.1. Evaluation of the Multi-stage Collaborative Filtering

In the first set of experiments, in order to prove the effectiveness of the multi-stage collaborative filtering, we analyze a single-channel (i.e. $K = 1$) acoustic echo cancelling application, in which the acoustic environment changes due to a nonstationary source or to an alteration in the environmental conditions. The AIR is simulated by means of Roomsim, a Matlab tool [10]; it is computed at an 8 kHz sampling rate and truncated after $L = 300$ samples. The length of the experiment is $t = 10$ s. Furthermore, independent white Gaussian noise with zero mean and unit variance is added as background noise, in order to provide a signal-to-noise ratio (SNR) of 20 dB. In order to introduce an abrupt change in the environment, we shift the AIR circularly to the right by 50 samples, 5 s after the start of the adaptive process. We choose the following parameter settings: $\mu_1 = 0.1$, $\mu_2 = 0.9$, $P_1 = 1$, $P_2 = 2$, $\delta = 30\sigma_{x_k}^2$, where $\sigma_{x_k}^2$ is the power of the input signal. In order to measure the filtering performance we use the normalized misalignment $\mathcal{M}$, expressed in dB, defined as:

$$\mathcal{M} = 20 \log_{10} \frac{\left\| h_n - \hat{h}_{n,k}^{(j)} \right\|_2}{\left\| h_n \right\|_2} \qquad (7)$$

where $h_n$ is the AIR column vector, and $\hat{h}_{n,k}^{(j)}$ is the estimated filter.

Figure 3 displays the performance results: the multi-stage collaborative filtering exploits the tracking capabilities of all four filters, always following the behaviour of the best-performing filter. Furthermore, Fig. 4 shows the behaviour of the three mixing parameters, $\lambda^{(1)}[n]$ and $\lambda^{(2)}[n]$ related to the first-stage combination, and $\eta[n]$ related to the second-stage combination. Fig. 4 makes the collaboration between the four filters easier to see.

4.2. Evaluation of the Collaborative Beamformer
[Figure 3: normalized misalignment [dB] versus time [sec], from 0 to 10 s. Recovered legend entries: MISO System #1 (P = 1, µ = 0.1); MISO System #2 (P = 1, µ = 0.9); MISO System #3 (P = 2, µ = 0.1).]

[Figure 4: mixing parameters λ(1)[n], λ(2)[n] and η[n] versus time [sec], from 0 to 10 s.]
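The performance measure of Section 4.1 can be illustrated with a minimal sketch. It evaluates the normalized misalignment of Eq. (7) and applies the circular right shift of the AIR used to create the abrupt change. The function names `normalized_misalignment_db` and `circular_shift_right` are hypothetical and are not taken from the paper; the vectors in the usage note are toy values, not the Roomsim response.

```python
import math


def normalized_misalignment_db(h_true, h_est):
    """Normalized misalignment in dB, following Eq. (7):
    20 * log10(||h_true - h_est||_2 / ||h_true||_2)."""
    if len(h_true) != len(h_est):
        raise ValueError("h_true and h_est must have the same length")
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(h_true, h_est)))
    ref = math.sqrt(sum(a * a for a in h_true))
    if ref == 0.0:
        raise ValueError("the reference response must be nonzero")
    if err == 0.0:
        return -math.inf  # perfect identification
    return 20.0 * math.log10(err / ref)


def circular_shift_right(h, shift):
    """Circularly shift the response h to the right by `shift` samples."""
    h = list(h)
    shift %= len(h)
    if shift == 0:
        return h
    return h[-shift:] + h[:-shift]
```

As a usage example under these assumptions, an estimate that has converged to `[1.0, 0.0, 0.0]` has a misalignment of about 3 dB against the shifted response `circular_shift_right([1.0, 0.0, 0.0], 1)`, which is the kind of jump the misalignment curves are expected to show at the instant of the abrupt change.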
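The first-stage combination of Eqs. (3) and (4) can be sketched for the single-channel case (K = 1) with two NLMS filters, that is, the P = 1 pair with a small and a large step size. This is a simplified illustration, not the authors' implementation. It makes several assumptions: the combined error drives the mixing update, as is common in schemes of the kind cited as [8]; all errors are taken at the same sample; the auxiliary parameter is clipped to `[-a_max, a_max]`; and the input is white Gaussian noise. All function names and default values are hypothetical.

```python
import math
import random


def sgm(a):
    """Sigmoid mapping the auxiliary parameter a[n] to lambda[n] in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def combine(lam, y_a, y_b):
    """Convex combination of two filter outputs, as in Eq. (3)."""
    return lam * y_a + (1.0 - lam) * y_b


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nlms_step(w, x, err, mu, delta):
    """One NLMS update (the P = 1 case); delta regularizes the normalization."""
    gain = mu * err / (dot(x, x) + delta)
    return [wi + gain * xi for wi, xi in zip(w, x)]


def run_combination(h, n_samples, mu_slow=0.1, mu_fast=0.9, mu_a=1.0,
                    beta=0.9, delta=1e-2, a_max=4.0, noise_std=0.01, seed=0):
    """Identify the toy response h with a slow and a fast NLMS filter and
    adapt the mixing parameter lambda[n] between their outputs."""
    rng = random.Random(seed)
    taps = len(h)
    w_slow, w_fast = [0.0] * taps, [0.0] * taps
    x = [0.0] * taps          # most recent input samples, newest first
    a, q = 0.0, 1.0           # auxiliary parameter and power estimate
    lam_hist = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]
        d = dot(h, x) + noise_std * rng.gauss(0.0, 1.0)
        y_slow, y_fast = dot(w_slow, x), dot(w_fast, x)
        lam = sgm(a)
        z = combine(lam, y_slow, y_fast)
        e_slow, e_fast, e_comb = d - y_slow, d - y_fast, d - z
        w_slow = nlms_step(w_slow, x, e_slow, mu_slow, delta)
        w_fast = nlms_step(w_fast, x, e_fast, mu_fast, delta)
        # Mixing update in the spirit of Eq. (4), normalized by the
        # smoothed power q of the error difference (assumed variant).
        delta_e = e_fast - e_slow
        q = beta * q + (1.0 - beta) * delta_e ** 2
        a += (mu_a / (q + 1e-12)) * e_comb * delta_e * lam * (1.0 - lam)
        a = max(-a_max, min(a_max, a))
        lam_hist.append(lam)
    return w_slow, w_fast, lam_hist
```

Calling `run_combination([0.5, -0.3, 0.2, 0.1], 2000)` returns the two coefficient vectors and the trajectory of the mixing parameter. In this sketch a value of lambda below 0.5 means the combination currently favours the large step size filter. The four-tap response is a toy stand-in for the 300-sample AIR of the experiment.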