Chapter 3 and 4: Performance Surface and Search Methods

CHAPTER THREE
Ch3: Eigenanalysis and Performance Surface
Ch4: Search Methods
Adaptive Filters: Theory and Applications, Behrouz Farhang-Boroujeny
EIGENVALUES AND EIGENVECTORS

 Let R be an N-by-N correlation matrix.
 A nonzero N-by-1 vector q is said to be an eigenvector of R
if it satisfies the equation
Rq = \lambda q                                  (4.2)
for some scalar constant λ. The scalar λ is called the
eigenvalue of R associated with the eigenvector q.
 We note that if q is an eigenvector of R, then for any
nonzero scalar a, aq is also an eigenvector of R,
corresponding to the same eigenvalue λ.
 To find the eigenvalues and eigenvectors of R, we note that
Eq. (4.2) may be rearranged as
(R - \lambda I)q = 0,
which has a nonzero solution q only when det(R - \lambda I) = 0
(the characteristic equation of R).


Example: Find the eigenvalues and eigenvectors of R = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}.
Solution:
\det(R - \lambda I) = 0 \Rightarrow \det\left(\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right) = 0
\det\begin{bmatrix} 2-\lambda & 0 \\ 0 & 3-\lambda \end{bmatrix} = 0 \Rightarrow (2-\lambda)(3-\lambda) = 0
\lambda_1 = 2 \text{ and } \lambda_2 = 3
For λ = 2:
(R - \lambda I)q = 0 \Rightarrow \begin{bmatrix} 2-2 & 0 \\ 0 & 3-2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\begin{bmatrix} 0 \cdot x + 0 \cdot y \\ 0 \cdot x + 1 \cdot y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow y = 0
q_1 = \begin{bmatrix} x \\ 0 \end{bmatrix} where x ≠ 0; for simplicity, take q_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
In the same manner, for λ = 3:
q_2 = \begin{bmatrix} 0 \\ y \end{bmatrix} where y ≠ 0; for simplicity, take q_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.

Example 1: Find the eigenvalues and the eigenvectors of the matrix A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}.
Solution:
A - \lambda I = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1-\lambda & 3 \\ 4 & 2-\lambda \end{bmatrix}
In general, to form A - λI from A one simply subtracts λ
from the diagonal entries.
∴ The characteristic equation is
\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 3 \\ 4 & 2-\lambda \end{vmatrix} = (1-\lambda)(2-\lambda) - 3 \cdot 4 = 0
∴ \lambda^2 - 3\lambda + 2 - 12 = 0
\lambda^2 - 3\lambda - 10 = 0
So the eigenvalues are
\lambda_1 = 5 \text{ and } \lambda_2 = -2

For λ1 = 5:
Let V = \begin{bmatrix} x \\ y \end{bmatrix} be an eigenvector of the matrix A
corresponding to λ1 = 5.
∵ (A - \lambda I)V = 0
∴ \left(\begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} - 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\begin{bmatrix} 1-5 & 3 \\ 4 & 2-5 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\begin{bmatrix} -4x + 3y \\ 4x - 3y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
So
-4x + 3y = 0
4x - 3y = 0

Since the second equation is the negative of the
first, any solution to the first equation is also a
solution to the second. So it suffices to solve the
first equation:
x = (3/4) y

  y | 1   | 4 | 8 | ...
  x | 3/4 | 3 | 6 | ...

So any multiple of the vector V_1 = \begin{bmatrix} 3 \\ 4 \end{bmatrix} is an eigenvector
for λ1 = 5.

For λ2 = -2:
Let V = \begin{bmatrix} x \\ y \end{bmatrix} be an eigenvector of the matrix A
corresponding to λ2 = -2.
∵ (A - \lambda I)V = 0
∴ \left(\begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} + 2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\begin{bmatrix} 1+2 & 3 \\ 4 & 2+2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\begin{bmatrix} 3x + 3y \\ 4x + 4y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
So
3x + 3y = 0
4x + 4y = 0


 Since the second equation is a multiple of the
first, any solution to the first equation is also a
solution to the second. So it suffices to solve the
first equation:
x = -y

  y | -1 | 1  | 2  | ...
  x | 1  | -1 | -2 | ...

Any multiple of the vector V_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} is an eigenvector for λ2 = -2.
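As a quick numerical check of the two worked examples above, the following sketch (added here, not part of the original notes) uses NumPy; note that numpy.linalg.eig normalizes its eigenvectors to unit length, so they may differ from the hand-derived ones by a scale factor.

```python
import numpy as np

# First example: R is diagonal, so its eigenvalues are the diagonal entries.
R = np.array([[2.0, 0.0],
              [0.0, 3.0]])
vals_R, vecs_R = np.linalg.eig(R)
print("eigenvalues of R:", vals_R)            # [2. 3.]

# Example 1: A = [[1, 3], [4, 2]].
A = np.array([[1.0, 3.0],
              [4.0, 2.0]])
vals_A, vecs_A = np.linalg.eig(A)
print("eigenvalues of A:", vals_A)            # [ 5. -2.]

# Each returned eigenvector is a scaled version of the hand-derived ones,
# e.g. [3, 4]^T for lambda = 5 and [1, -1]^T for lambda = -2.
for lam, v in zip(vals_A, vecs_A.T):
    print(f"lambda = {lam:+.1f}:  A v - lambda v =", A @ v - lam * v)   # ~[0, 0]
```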

PROPERTIES OF
EIGENVALUES AND EIGENVECTORS

 Some of the properties derived here are directly related
to the fact that the correlation matrix R is Hermitian
and nonnegative definite.
 A matrix A, in general, is said to be Hermitian (or
self-adjoint) if A = A^H, i.e., A equals its own conjugate transpose.
 The N-by-N Hermitian matrix A is said to be
nonnegative definite, or positive semidefinite, if
v^H A v ≥ 0 for every N-by-1 vector v.
 The fact that A is Hermitian implies that v^H A v is
real-valued.
 In practice, the correlation matrix R is almost always
positive definite.


1. The eigenvalues of the correlation matrix R are
all real and nonnegative.
2. If q_i and q_j are two eigenvectors of the
correlation matrix R that correspond
to two of its distinct eigenvalues, then
q_i^H q_j = 0.
In other words, eigenvectors associated with the
distinct eigenvalues of the correlation
matrix R are mutually orthogonal.
3. Assume the eigenvectors q_0, q_1, . . . , q_{N-1} are all
normalized to have a length of unity. The
N-by-N matrix Q = [q_0 q_1 … q_{N-1}] is then a unitary
matrix, i.e., Q^H Q = I. This implies that the matrices
Q and Q^H are the inverses of each other, i.e., Q^{-1} = Q^H.

4. For any N-by-N correlation matrix R, one can
always find a set of N mutually orthogonal
eigenvectors. Such a set may be used as a basis
to express any vector in the N-dimensional
space of complex vectors.
5. Unitary Similarity Transformation. The
correlation matrix R can always be decomposed
as
R = Q \Lambda Q^H = \sum_{i=0}^{N-1} \lambda_i q_i q_i^H,
where \Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1}).
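The sketch below illustrates properties 3 and 5 (and the trace property stated next) numerically; the 2-by-2 correlation matrix is an arbitrary illustrative choice, not one from the slides.

```python
import numpy as np

# An illustrative symmetric, positive-definite correlation matrix.
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])

# eigh is the appropriate routine for Hermitian/symmetric matrices:
# it returns real eigenvalues and an orthonormal set of eigenvectors.
lam, Q = np.linalg.eigh(R)
Lam = np.diag(lam)

print(np.allclose(Q.T @ Q, np.eye(2)))     # property 3: Q is unitary (orthogonal here)
print(np.allclose(Q @ Lam @ Q.T, R))       # property 5: R = Q Lambda Q^H
print(np.isclose(lam.sum(), np.trace(R)))  # property 6 (next slide): sum of eigenvalues = tr[R]
```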


6. Let λ_0, λ_1, . . . , λ_{N−1} be the eigenvalues of the
correlation matrix R. Then
\sum_{i=0}^{N-1} \lambda_i = \mathrm{tr}[R],
where tr[R] denotes the trace of R and is defined as the
sum of the diagonal elements of R.
7. Minimax Theorem. With the eigenvalues arranged in descending order,
λ_0 ≥ λ_1 ≥ · · · ≥ λ_{N−1}, the standard (Courant–Fischer) statement is
\lambda_i = \min_{\dim \mathcal{S} = N-i} \; \max_{q \in \mathcal{S},\, q \neq 0} \frac{q^H R q}{q^H q},
for i = 1, 2, . . . , N - 1.

8. The eigenvalues of the correlation matrix R of a
discrete-time stationary stochastic process {x(n)}
are bounded by the minimum and maximum
values of the power spectral density, Φ_xx(e^{jω}),
of the process:
\min_{\omega} \Phi_{xx}(e^{j\omega}) \le \lambda_i \le \max_{\omega} \Phi_{xx}(e^{j\omega}), \quad i = 0, 1, \ldots, N-1.
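To illustrate property 8 numerically, the sketch below (an example added here, not taken from the slides) builds the correlation matrix of a first-order autoregressive process x(n) = a x(n-1) + w(n), whose autocorrelation is r(k) = σ_x² a^{|k|} with σ_x² = σ_w²/(1 - a²) and whose power spectral density is Φ_xx(e^{jω}) = σ_w²/|1 - a e^{-jω}|², and checks that every eigenvalue lies between the PSD extremes.

```python
import numpy as np

a, sigma_w2, N = 0.8, 1.0, 8              # AR(1) coefficient, noise variance, matrix size
sigma_x2 = sigma_w2 / (1 - a**2)          # variance of the AR(1) process

# N-by-N Toeplitz correlation matrix: R[i, j] = sigma_x^2 * a^|i - j|
idx = np.arange(N)
R = sigma_x2 * a ** np.abs(idx[:, None] - idx[None, :])
eigvals = np.linalg.eigvalsh(R)

# Power spectral density of the AR(1) process on a dense frequency grid.
w = np.linspace(0, np.pi, 4096)
psd = sigma_w2 / np.abs(1 - a * np.exp(-1j * w))**2

print("eigenvalues :", np.round(np.sort(eigvals), 3))
print("PSD min/max :", psd.min(), psd.max())
print("bounded?    :", psd.min() <= eigvals.min() and eigvals.max() <= psd.max())
```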


9. Karhunen–Loève expansion: a zero-mean random vector x may be
expanded as x = \sum_{i=0}^{N-1} \kappa_i q_i, where the coefficients
\kappa_i = q_i^H x are uncorrelated and E[|\kappa_i|^2] = \lambda_i.

THE CANONICAL FORM OF
THE ERROR-PERFORMANCE SURFACE

 We recall from the last lecture that the performance
function (the mean-squared error, MSE) of a
transversal Wiener filter with a real-valued input
sequence x(n) and a desired output sequence d(n)
is
\xi = E[e^2(n)] = E[d^2(n)] - 2 w^T p + w^T R w,
where R = E[x(n) x^T(n)] is the correlation matrix of the tap-input
vector and p = E[d(n) x(n)] is the cross-correlation vector.
 Also, we recall that the optimum value of the
Wiener filter tap-weight vector is obtained from
the Wiener-Hopf equation
R w_o = p.


 The performance function ξ may be
rearranged as follows:
\xi = \xi_{\min} + (w - w_o)^T R (w - w_o),
where \xi_{\min} = E[d^2(n)] - w_o^T p is the minimum MSE, attained at w = w_o.
Defining the translated coordinate vector v = w - w_o, this becomes
\xi = \xi_{\min} + v^T R v.

 We use the eigen-decomposition to express the
correlation matrix R of the tap-input vector in
terms of its eigenvalues and associated
eigenvectors (see Appendix E in the Haykin textbook):
R = Q \Lambda Q^T.
Substituting this into \xi = \xi_{\min} + v^T R v and defining the rotated
coordinate vector v' = Q^T v gives
\xi = \xi_{\min} + v'^T \Lambda v'.


 This new formulation of the mean-square error
contains no cross-product terms, as shown by
\xi = \xi_{\min} + \sum_{k=0}^{N-1} \lambda_k v_k'^2,
where v_k' is the kth component of the vector v'.
This is the canonical form of the error-performance surface: its
principal axes are aligned with the eigenvectors of R, and the
curvature along the kth axis is determined by λ_k.
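As a check on the algebra above, the short sketch below evaluates the MSE both from the direct expansion and from its canonical form for an arbitrary two-tap example (the numbers are illustrative, not the textbook's).

```python
import numpy as np

# Illustrative second-order statistics (not the textbook example).
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
p = np.array([0.7, 0.3])
Ed2 = 2.0                          # E[d^2(n)]

w_o = np.linalg.solve(R, p)        # Wiener-Hopf: R w_o = p
xi_min = Ed2 - w_o @ p             # minimum MSE

lam, Q = np.linalg.eigh(R)         # R = Q diag(lam) Q^T

w = np.array([0.2, -0.4])          # an arbitrary tap-weight vector
xi_direct = Ed2 - 2 * w @ p + w @ R @ w

v_prime = Q.T @ (w - w_o)          # rotated, translated coordinates
xi_canonical = xi_min + np.sum(lam * v_prime**2)

print(xi_direct, xi_canonical)     # the two values agree
```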

 Example 4.3: Consider the case where a two-tap
transversal Wiener filter is characterized by the
following parameters (given in the textbook in terms of a
parameter α):
We want to explore the performance surface of this
filter for values of α ranging from 0 to 1.
The performance function of the filter is obtained
by substituting the above parameters into Eq. (4.81).
This gives ξ as a function of the two tap weights and the parameter α.


 Solving the Wiener–Hopf equation, w_o = R^{-1} p, we obtain the
optimum tap weights of the filter.

 To convert this to its canonical form, we should
first find the eigenvalues and eigenvectors of R.
To find the eigenvalues of R, we solve the
characteristic equation
\det(R - \lambda I) = 0.


SEARCH METHODS
1. METHOD OF STEEPEST DESCENT


 We recall from Chapter 2 that the optimum tap-
weight vector w_o is the one that minimizes the
performance function
\xi = E[e^2(n)],
where e(n) = d(n) - y(n) is the estimation error of the
Wiener filter. Also, we recall that the performance
function ξ can be expanded as
\xi = E[d^2(n)] - 2 w^T p + w^T R w.
 Here, we assume that R and p are available, so that w_o could, in
principle, be computed directly as w_o = R^{-1} p.
 However, this direct approach requires difficult arithmetic
circuits (it involves the computationally challenging
inversion of the matrix R) and is not suitable for many
applications. Therefore, in the next section we
introduce another approach to find the tap-weight
vector w.

ITERATIVE SEARCH METHOD

 In this chapter we present a set of algorithms that
iteratively search for the minimum of the cost function.
 They do this based (at least) on the gradient of the cost
function, so they are often called deterministic gradient
algorithms.
 In order for the cost function to depend only on the filter w,
the statistics Rx and rxd must be given.
 In this way, these algorithms solve the Wiener-Hopf
equation iteratively, most of them without requiring the
(computationally challenging) inversion of the matrix Rx.
 However, all the information from the environment is
captured in the second-order statistics and these
algorithms do not have a learning mechanism for adapting
to changes in the environment. In the next chapter we will
see how adaptive filters solve this issue.


STEEPEST DESCENT ALGORITHM

 The method of steepest descent is a general scheme
that uses the following steps to search for the
minimum point of any convex function of a set of
parameters:
1. Start with an initial guess of the parameters whose
optimum values are to be found for minimizing the
function.
2. Find the gradient of the function with respect to
these parameters at the present point.
3. Update the parameters by taking a step in the
opposite direction of the gradient vector obtained in
Step 2. This corresponds to a step in the direction of
steepest descent in the cost function at the present
point. Furthermore, the size of the step taken is
chosen proportional to the size of the gradient
vector.
4. Repeat Steps 2 and 3 until no further significant
change is observed in the parameters.

 To implement this procedure in the case of the
transversal filter shown in Figure 5.1, we recall
from Chapter 2 that the gradient of the performance function with
respect to the tap-weight vector w is
\nabla \xi = 2(R w - p).
Using this gradient in Step 3 gives the steepest-descent recursion
w(k+1) = w(k) - \mu \nabla \xi(k) = w(k) + 2\mu \big(p - R w(k)\big),
where μ is the step-size parameter and k is the iteration index.
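The following sketch (added here, not from the slides) implements the four steps above for the quadratic MSE cost, using the recursion just written, w(k+1) = w(k) - μ∇ξ(k); the statistics R and p, the step size, and the stopping tolerance are illustrative choices.

```python
import numpy as np

def steepest_descent(R, p, mu, w0, tol=1e-10, max_iter=10_000):
    """Minimize xi(w) = E[d^2] - 2 w^T p + w^T R w by steepest descent."""
    w = w0.astype(float).copy()
    for k in range(max_iter):
        grad = 2 * (R @ w - p)                 # Step 2: gradient of the quadratic cost
        w_new = w - mu * grad                  # Step 3: move against the gradient
        if np.linalg.norm(w_new - w) < tol:    # Step 4: stop when the change is negligible
            return w_new, k + 1
        w = w_new
    return w, max_iter

# Illustrative statistics (not the textbook's example values).
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
p = np.array([0.7, 0.3])

w_sd, n_iter = steepest_descent(R, p, mu=0.3, w0=np.zeros(2))
print("steepest descent:", w_sd, "after", n_iter, "iterations")
print("Wiener solution :", np.linalg.solve(R, p))
```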


 As we shall soon show, the convergence of w(k) to
the optimum solution w_o, and the speed at which
this convergence takes place, depend on
the size of the step-size parameter μ.
 A large step size may result in divergence of this
recursive equation, which may be written compactly as
w(k+1) = (I - 2\mu R)\, w(k) + 2\mu p,
where I is the N-by-N identity matrix.

THE V-AXES

 We substitute for p from Eq. (5.6), i.e., p = R w_o. Also, we
subtract w_o from both sides of Eq. (5.11) and
rearrange the result to obtain, with v(k) = w(k) - w_o,
v(k+1) = (I - 2\mu R)\, v(k).
 This is the tap-weight update equation in terms
of the v-axes.


THE V'-AXES
 Recall that R has the following unitary similarity
decomposition:
R = Q \Lambda Q^T.
Substituting this into the v-axes recursion and defining v'(k) = Q^T v(k)
gives
v'(k+1) = (I - 2\mu \Lambda)\, v'(k).

 The vector recursive Eq. (5.18) may be separated
into the scalar recursive equations
v_i'(k+1) = (1 - 2\mu \lambda_i)\, v_i'(k), \quad i = 0, 1, \ldots, N-1,
so that v_i'(k) = (1 - 2\mu \lambda_i)^k v_i'(0).
 These modes converge to zero provided that |1 - 2μλ_i| < 1 for all i,
i.e., the step-size parameter μ is selected so that
0 < \mu < \frac{1}{\lambda_{\max}},
where λ_max is the largest eigenvalue of R.
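A small numerical check of this decoupling, using the same illustrative statistics and update convention as in the earlier sketch:

```python
import numpy as np

R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
p = np.array([0.7, 0.3])
mu = 0.3
w_o = np.linalg.solve(R, p)

lam, Q = np.linalg.eigh(R)          # R = Q diag(lam) Q^T

w = np.zeros(2)                     # initial guess w(0)
v_prime0 = Q.T @ (w - w_o)          # initial v'(0)

for k in range(1, 6):
    w = w - mu * 2 * (R @ w - p)                   # steepest-descent update on w
    v_prime = Q.T @ (w - w_o)                      # transform to the v'-axes
    v_pred = (1 - 2 * mu * lam) ** k * v_prime0    # decoupled scalar modes
    print(k, np.allclose(v_prime, v_pred))         # True: the modes evolve independently
```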


Starting with an initial value w(0) = [w_0(0) w_1(0)]^T and letting the
recursive equation (5.29) run, we get two sequences of the tap-weight
variables w_0(k) and w_1(k).


LEARNING CURVE

 The curve obtained by plotting ξ(k) as a function
of the iteration index, k, is called the learning curve.
The learning curve of the steepest-descent
algorithm, as can be seen from Eq. (5.31),
consists of a sum of N exponentially decaying
terms, each of which corresponds to one of the
modes of convergence of the algorithm.
 Each exponential term may be characterized by a
time constant, which is obtained as follows.
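A sketch of the standard argument, under the canonical form and update convention used above:
\xi(k) = \xi_{\min} + \sum_{i=0}^{N-1} \lambda_i\, v_i'^2(k)
       = \xi_{\min} + \sum_{i=0}^{N-1} \lambda_i\, (1 - 2\mu\lambda_i)^{2k}\, v_i'^2(0).
Identifying the decay of the ith term with an exponential, (1 - 2\mu\lambda_i)^{2k} = e^{-k/\tau_i}, gives the time constant
\tau_i = \frac{-1}{2 \ln|1 - 2\mu\lambda_i|} \approx \frac{1}{4\mu\lambda_i} \quad \text{for } \mu\lambda_i \ll 1.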


 The existence of two distinct time constants on the learning curve in
Figure 5.5 is clearly observed.



EFFECT OF EIGENVALUE SPREAD

 Our study in the last two sections shows that the
performance of the steepest-descent algorithm is highly
dependent on the eigenvalues of the correlation matrix R.
 In general, a wider spread of the eigenvalues results in a
poorer performance of the steepest-descent algorithm.
 To gain further insight into this property of the steepest-
descent algorithm, we find the optimum value of the step-
size parameter μ, which results in the fastest possible
convergence of the steepest-descent algorithm.

THE GEOMETRICAL RATIO FACTOR

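A sketch of the standard result behind this heading, under the same update convention as above: the ith mode decays with geometric ratio |1 - 2μλ_i|, and the overall convergence speed is limited by the largest of these ratios. Minimizing that worst-case ratio over μ,
\mu_{\mathrm{opt}} = \arg\min_{\mu}\ \max_i |1 - 2\mu\lambda_i|
\;\Rightarrow\; 1 - 2\mu_{\mathrm{opt}}\lambda_{\min} = -(1 - 2\mu_{\mathrm{opt}}\lambda_{\max})
\;\Rightarrow\; \mu_{\mathrm{opt}} = \frac{1}{\lambda_{\min} + \lambda_{\max}},
and the resulting worst-case geometric ratio is
\beta = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} = \frac{\chi - 1}{\chi + 1}, \qquad \chi = \frac{\lambda_{\max}}{\lambda_{\min}},
so a larger eigenvalue spread χ pushes β toward 1, i.e., slower convergence of the steepest-descent algorithm.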


NEWTON’S METHOD
 Our discussions in the last few sections show that
the steepest-descent algorithm may suffer from
slow modes of convergence, which arise due to
the spread in the eigenvalues of the correlation
matrix R.
 This means that if we can somehow get rid of the
eigenvalue spread, we can get a much better
convergence performance. This is exactly what
Newton’s method does.


 To derive Newton's method for the quadratic
case, we start from the steepest-descent
algorithm given in Eq. (5.10). Using p = R w_o, Eq.
(5.10) becomes
w(k+1) = w(k) - 2\mu R \big(w(k) - w_o\big).
Newton's method premultiplies the gradient by R^{-1}, which removes
the dependence on the eigenvalue spread:
w(k+1) = w(k) - 2\mu R^{-1} \big(R w(k) - p\big) = (1 - 2\mu)\, w(k) + 2\mu\, w_o,
so every mode converges with the same geometric ratio (1 - 2μ), and
for μ = 1/2 the algorithm reaches w_o in a single step.
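A sketch comparing the two recursions numerically (illustrative values; the step-size conventions follow the equations written above rather than any particular textbook equation numbers):

```python
import numpy as np

R = np.array([[1.0, 0.9],
              [0.9, 1.0]])           # large eigenvalue spread: eigenvalues 0.1 and 1.9
p = np.array([0.5, 0.2])
w_o = np.linalg.solve(R, p)
mu_sd, mu_nt = 0.4, 0.5              # 0 < mu_sd < 1/lambda_max; mu_nt = 1/2

w_sd = np.zeros(2)
w_nt = np.zeros(2)
for k in range(1, 21):
    w_sd = w_sd - mu_sd * 2 * (R @ w_sd - p)                    # steepest descent
    w_nt = w_nt - mu_nt * 2 * np.linalg.solve(R, R @ w_nt - p)  # Newton step
    if k in (1, 5, 20):
        print(k, np.linalg.norm(w_sd - w_o), np.linalg.norm(w_nt - w_o))
# Newton reaches w_o after the first iteration (mu = 1/2), while the
# steepest-descent error decays slowly along the lambda = 0.1 mode.
```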

