Lecture 3


Mathematical Foundations for Data

Science
BITS Pilani MFDS Team
Pilani Campus

DSECL ZC416, MFDS

Lecture No. 3
Agenda

•  Eigenvalues and eigenvectors

•  Gerschgorin’s theorem

•  Similarity transformation

•  Diagonalization of matrices

•  Quadratic forms

BITS Pilani, Pilani Campus


Eigenvalue Problem
A matrix eigenvalue problem considers the vector equation
(1) Ax = λx,
where A is a given square matrix, λ an unknown scalar (real
or complex), and x an unknown vector.
The task is to determine the λ’s and the x’s (which depend on
the λ’s) that satisfy (1).
Since x = 0 is always a solution for any λ, we admit only
solutions with x ≠ 0.
The solutions of (1) are given the following names: the λ’s
that satisfy (1) are called eigenvalues of A, and the
corresponding nonzero x’s that also satisfy (1) are called
eigenvectors of A.

Spectrum

The set of all the eigenvalues of A is called the spectrum of


A. We shall see that the spectrum consists of at least one
eigenvalue and at most n numerically different
eigenvalues.


The largest of the absolute values of the eigenvalues of A is
called the spectral radius of A, a name to be motivated later.
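
As a quick numerical illustration (a NumPy sketch, not part of the original slides), the spectrum and spectral radius can be computed directly; the matrix below is the 2 × 2 example worked out later in this lecture.

```python
import numpy as np

# the 2x2 matrix from the worked example later in this lecture
A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

spectrum = np.linalg.eigvals(A)        # the set of eigenvalues of A
spectral_radius = max(abs(spectrum))   # largest absolute eigenvalue

print(np.sort(spectrum.real))          # eigenvalues: -6 and -1
print(spectral_radius)                 # spectral radius: 6
```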

Determination of Eigenvalues
and Eigenvectors
Example 1: Find the eigenvalues and eigenvectors of

A = ⎡ −5   2 ⎤
    ⎣  2  −2 ⎦ .

Solution.
(a) Eigenvalues. These must be determined first.
Equation (1) in components is

Ax = ⎡ −5   2 ⎤ ⎡ x1 ⎤ = λ ⎡ x1 ⎤ ;
     ⎣  2  −2 ⎦ ⎣ x2 ⎦     ⎣ x2 ⎦

that is,
−5x1 + 2x2 = λx1
 2x1 − 2x2 = λx2.

Example

Solution. (continued 1)
(a) Eigenvalues. (continued 1)
Transferring the terms on the right to the left, we get

(2*)  (−5 − λ)x1 + 2x2 = 0
       2x1 + (−2 − λ)x2 = 0.

This can be written in matrix notation as
(3*) (A − λI)x = 0,
because (1) is Ax − λx = Ax − λIx = (A − λI)x = 0,
which gives (3*).

Example

Solution. (continued 2)
(a) Eigenvalues. (continued 2)
We see that this is a homogeneous linear system. By
Cramer’s theorem in Sec. 7.7 it has a nontrivial solution (an
eigenvector of A we are looking for) if and only if its
coefficient determinant is zero, that is,

(4*)  D(λ) = det(A − λI) = | −5 − λ     2    |
                           |    2    −2 − λ  |
      = (−5 − λ)(−2 − λ) − 4 = λ² + 7λ + 6 = 0.

Example

Solution. (continued 3)
(a) Eigenvalues. (continued 3)
We call D(λ) the characteristic determinant or, if expanded,
the characteristic polynomial, and D(λ) = 0 the
characteristic equation of A. The solutions of this quadratic
equation are λ1 = −1 and λ2 = −6. These are the eigenvalues
of A.
(b1) Eigenvector of A corresponding to λ1. This vector is
obtained from (2*) with λ = λ1 = −1, that is,
−4 x1 + 2 x2 = 0

2 x1 − x2 = 0.

Example

Solution. (continued 4)
(b1) Eigenvector of A corresponding to λ1. (continued)
A solution is x2 = 2x1, as we see from either of the two
equations, so that we need only one of them. This
determines an eigenvector corresponding to λ1 = −1 up to a
scalar multiple. If we choose x1 = 1, we obtain the
eigenvector
v = ⎡ 1 ⎤ ,   Check: Av = ⎡ −5   2 ⎤ ⎡ 1 ⎤ = ⎡ −1 ⎤ = (−1)v = λ1v.
    ⎣ 2 ⎦                 ⎣  2  −2 ⎦ ⎣ 2 ⎦   ⎣ −2 ⎦

Example

Solution. (continued 5)
(b2) Eigenvector of A corresponding to λ2.
For λ = λ2 = −6, equation (2*) becomes
x1 + 2 x2 = 0
2 x1 + 4 x2 = 0.
A solution is x2 = −x1/2 with arbitrary x1. If we choose x1 = 2,
we get x2 = −1. Thus an eigenvector of A corresponding to
λ2 = −6 is
w = ⎡  2 ⎤ ,   Check: Aw = ⎡ −5   2 ⎤ ⎡  2 ⎤ = ⎡ −12 ⎤ = (−6)w = λ2w.
    ⎣ −1 ⎦                 ⎣  2  −2 ⎦ ⎣ −1 ⎦   ⎣   6 ⎦
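
The two eigenpairs can also be verified numerically; this NumPy snippet (our addition, not part of the slides) repeats the Check steps above.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

v = np.array([1.0, 2.0])    # eigenvector for lambda_1 = -1
w = np.array([2.0, -1.0])   # eigenvector for lambda_2 = -6

print(np.allclose(A @ v, -1.0 * v))   # True: A*v = (-1)*v
print(np.allclose(A @ w, -6.0 * w))   # True: A*w = (-6)*w
```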

Eigenvalue Analysis
This example illustrates the general case as follows. Equation (1) written in
components is

a11x1 + ⋯ + a1n xn = λx1
a21x1 + ⋯ + a2n xn = λx2
⋯⋯⋯⋯⋯⋯⋯⋯
an1x1 + ⋯ + ann xn = λxn.

Transferring the terms on the right side to the left side, we have

(2)  (a11 − λ)x1 + a12 x2 + ⋯ + a1n xn = 0
      a21x1 + (a22 − λ)x2 + ⋯ + a2n xn = 0
      ⋯⋯⋯⋯⋯⋯⋯⋯
      an1x1 + an2 x2 + ⋯ + (ann − λ)xn = 0.

Eigenvalue Analysis
In matrix notation,
(3) (A − λI)x = 0.
By Cramer’s theorem in Sec. 7.7, this homogeneous linear
system of equations has a nontrivial solution if and only if
the corresponding determinant of the coefficients is zero:

(4)  D(λ) = det(A − λI) = | a11 − λ    a12    ⋯    a1n    |
                          |   a21    a22 − λ  ⋯    a2n    |  = 0.
                          |    ⋅        ⋅     ⋯     ⋅     |
                          |   an1      an2    ⋯  ann − λ  |

Eigenvalue Analysis

A − λI is called the characteristic matrix and D(λ) the


characteristic determinant of A. Equation (4) is called the
characteristic equation of A. By developing D(λ) we obtain
a polynomial of nth degree in λ. This is called the
characteristic polynomial of A.
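
These definitions can be checked numerically (a sketch, assuming NumPy is available): np.poly returns the coefficients of the characteristic polynomial det(λI − A), highest power first, and its roots are the eigenvalues.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

# coefficients of det(lambda*I - A): here lambda^2 + 7*lambda + 6
coeffs = np.poly(A)
print(coeffs)                       # approximately [1. 7. 6.]

# the roots of the characteristic polynomial are the eigenvalues
print(np.sort(np.roots(coeffs)))    # approximately -6 and -1
```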

Eigenvalues

Theorem 1
Eigenvalues
The eigenvalues of a square matrix A are the roots of the
characteristic equation (4) of A.
Hence an n × n matrix has at least one eigenvalue and at most n
numerically different eigenvalues.

The eigenvalues must be determined first.


Once these are known, corresponding eigenvectors are
obtained from the system (2), for instance, by the Gauss
elimination, where λ is the eigenvalue for which an
eigenvector is wanted.

Eigenspace

Theorem 2
Eigenvectors, Eigenspace
If w and x are eigenvectors of a matrix A corresponding to the
same eigenvalue λ, so are w + x (provided x ≠ −w) and kx for
any k ≠ 0.
Hence the eigenvectors corresponding to one and the same
eigenvalue λ of A, together with 0, form a vector space called the
eigenspace of A corresponding to that λ.

Multiple Eigenvalues

Example 2: Find the eigenvalues and eigenvectors of


A = ⎡ −2   2  −3 ⎤
    ⎢  2   1  −6 ⎥ .
    ⎣ −1  −2   0 ⎦


Solution.
For our matrix, the characteristic determinant gives the
characteristic equation
−λ³ − λ² + 21λ + 45 = 0.
The roots (eigenvalues of A) are λ1 = 5, λ2 = λ3 = −3.

Multiple Eigenvalues

Solution. (continued 1)
To find eigenvectors, we apply the Gauss elimination (Sec.
7.3) to the system (A − λI)x = 0, first with λ = 5
and then with λ = −3. For λ = 5 the characteristic matrix is

A − λI = A − 5I = ⎡ −7   2  −3 ⎤
                  ⎢  2  −4  −6 ⎥ .
                  ⎣ −1  −2  −5 ⎦

It row-reduces to

⎡ −7      2      −3    ⎤
⎢  0   −24/7   −48/7   ⎥ .
⎣  0      0       0    ⎦

Multiple Eigenvalues

Solution. (continued 2)
Hence it has rank 2. Choosing x3 = −1 we have x2 = 2 from
−(24/7)x2 − (48/7)x3 = 0 and then x1 = 1 from −7x1 + 2x2 − 3x3 = 0.
Hence an eigenvector of A corresponding to λ = 5 is
x1 = [1  2  −1]T.
For λ = −3 the characteristic matrix

A − λI = A + 3I = ⎡  1   2  −3 ⎤
                  ⎢  2   4  −6 ⎥
                  ⎣ −1  −2   3 ⎦

row-reduces to

⎡ 1  2  −3 ⎤
⎢ 0  0   0 ⎥ .
⎣ 0  0   0 ⎦

Multiple Eigenvalues

Solution. (continued 3)
Hence it has rank 1.
From x1 + 2x2 − 3x3 = 0 we have x1 = −2x2 + 3x3. Choosing
x2 = 1, x3 = 0 and x2 = 0, x3 = 1, we obtain two linearly
independent eigenvectors of A corresponding to λ = −3 [as
they must exist by (5), Sec. 7.5, with rank = 1 and n = 3],
x2 = ⎡ −2 ⎤         x3 = ⎡ 3 ⎤
     ⎢  1 ⎥   and        ⎢ 0 ⎥ .
     ⎣  0 ⎦              ⎣ 1 ⎦
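
Example 2 can be checked numerically (a NumPy sketch, our addition): λ = −3 is a double eigenvalue, and A + 3I has rank 1, so its null space (the eigenspace) is 2-dimensional.

```python
import numpy as np

A = np.array([[-2.0, 2.0, -3.0],
              [2.0, 1.0, -6.0],
              [-1.0, -2.0, 0.0]])

# eigenvalues should be 5, -3, -3
print(np.sort(np.linalg.eigvals(A).real))

# rank of the characteristic matrix A + 3I, and the eigenspace
# dimension n - rank = 3 - 1 = 2 for lambda = -3
rank = np.linalg.matrix_rank(A + 3.0 * np.eye(3))
print(rank, 3 - rank)   # 1 2
```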

Gerschgorin’s Theorem
The theorem gives bounds on the eigenvalues.
Every eigenvalue λ of an n × n matrix A = [aij] satisfies

|λ − aii| ≤ Σj≠i |aij|   for at least one i, 1 ≤ i ≤ n.

Example:

A = ⎡  0   1/2  1/2 ⎤
    ⎢ 1/2   5    1  ⎥
    ⎣ 1/2   1    1  ⎦

We get the Gerschgorin disks
D1: centre 0, radius 1
D2: centre 5, radius 1.5
D3: centre 1, radius 1.5
The centres are the main diagonal entries of A. These would
be the eigenvalues of A if A were diagonal.
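
A small NumPy sketch of the theorem (our addition; the matrix entries are as read from the slide, consistent with the three disks listed above):

```python
import numpy as np

A = np.array([[0.0, 0.5, 0.5],
              [0.5, 5.0, 1.0],
              [0.5, 1.0, 1.0]])

centres = np.diag(A)                          # main diagonal entries
radii = np.abs(A).sum(axis=1) - np.abs(centres)  # off-diagonal row sums
print(centres)   # [0. 5. 1.]
print(radii)     # [1.  1.5 1.5]

# every eigenvalue lies in at least one Gerschgorin disk
for lam in np.linalg.eigvals(A):
    assert min(abs(lam - centres) - radii) <= 1e-9
```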
Algebraic Multiplicity
& Geometric Multiplicity
The order Mλ of an eigenvalue λ as a root of the
characteristic polynomial is called the algebraic
multiplicity of λ. The number mλ of linearly independent
eigenvectors corresponding to λ is called the geometric
multiplicity of λ. Thus mλ is the dimension of the
eigenspace corresponding to this λ.
Since the characteristic polynomial has degree n, the sum
of all the algebraic multiplicities must equal n. In Example 2
for λ = −3 we have mλ = Mλ = 2. In general, mλ ≤ Mλ, as can
be shown. The difference Δλ = Mλ − mλ is called the defect of
λ. Thus Δ−3 = 0 in Example 2, but positive defects Δλ can
easily occur.
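
A positive defect is easy to exhibit; this sketch uses the standard matrix [[0, 1], [0, 0]] (our example, not from the slides):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# the characteristic polynomial is lambda^2, so lambda = 0 has
# algebraic multiplicity M = 2 ...
M = 2

# ... but the eigenspace of lambda = 0 is the null space of A - 0*I,
# whose dimension is n - rank(A) = 2 - 1 = 1
m = 2 - np.linalg.matrix_rank(A)

print(M, m, M - m)   # 2 1 1, i.e. the defect Delta is 1
```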

Special cases
Theorem 3
Eigenvalues of the Transpose
The transpose AT of a square matrix A has the same eigenvalues
as A.

Basis of Eigenvectors

If an n × n matrix A has n distinct eigenvalues, then A has a basis
of eigenvectors x1, … , xn for Rn.

Eigenvectors corresponding to Distinct
Eigenvalues are Linearly Independent
Let v1, v2, …, vp be eigenvectors of A corresponding to the distinct
eigenvalues λ1, λ2, …, λp. Let k be the largest positive integer such that
v1, v2, …, vk are linearly independent. If k = p, nothing is to be proved.
If k < p, then vk+1 is a linear combination of v1, …, vk; that is, there exist
constants c1, c2, …, ck such that
vk+1 = c1v1 + c2v2 + ⋯ + ckvk.
Applying the matrix A to both sides, we have
Avk+1 = λk+1vk+1
      = λk+1(c1v1 + c2v2 + ⋯ + ckvk)
      = c1λk+1v1 + c2λk+1v2 + ⋯ + ckλk+1vk;
Avk+1 = A(c1v1 + c2v2 + ⋯ + ckvk)
      = c1Av1 + c2Av2 + ⋯ + ckAvk
      = c1λ1v1 + c2λ2v2 + ⋯ + ckλkvk.
Subtracting the two expressions for Avk+1, we get
c1(λk+1 − λ1)v1 + c2(λk+1 − λ2)v2 + ⋯ + ck(λk+1 − λk)vk = 0.
Since v1, v2, …, vk are linearly independent, we have
c1(λk+1 − λ1) = c2(λk+1 − λ2) = ⋯ = ck(λk+1 − λk) = 0.
Note that the eigenvalues are distinct. Hence
c1 = c2 = ⋯ = ck = 0,
which implies that vk+1 is the zero vector. This contradicts vk+1 ≠ 0.

Similarity of Matrices

Similar Matrices. Similarity Transformation

An n × n matrix Â is called similar to an n × n matrix A if

(4) Â = P−1AP

for some (nonsingular!) n × n matrix P. This transformation,
which gives Â from A, is called a similarity transformation.

Eigenvalues and Eigenvectors of Similar Matrices

If Â is similar to A, then Â has the same eigenvalues as A.
Furthermore, if x is an eigenvector of A, then y = P−1x is an
eigenvector of Â corresponding to the same eigenvalue.
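
Both claims can be checked numerically; this sketch uses our own choice of P (any nonsingular matrix works) and the 2 × 2 example matrix from earlier in the lecture.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # any nonsingular P will do

P_inv = np.linalg.inv(P)
A_hat = P_inv @ A @ P               # the similarity transformation

# same eigenvalues
print(np.allclose(np.sort(np.linalg.eigvals(A).real),
                  np.sort(np.linalg.eigvals(A_hat).real)))   # True

# y = P^{-1} x is an eigenvector of A_hat for the same eigenvalue
x = np.array([1.0, 2.0])            # eigenvector of A for lambda = -1
y = P_inv @ x
print(np.allclose(A_hat @ y, -1.0 * y))                      # True
```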

Diagonalization of a Matrix

Diagonalization of a Matrix

If an n × n matrix A has a basis of eigenvectors, then


(5) D = X−1AX
is diagonal, with the eigenvalues of A as the entries on the main
diagonal. Here X is the matrix with these eigenvectors as column
vectors. Also,
(5*) Dm = X−1AmX (m = 2, 3, … ).

Diagonalization of a Matrix

Diagonalize

A = ⎡   7.3   0.2  −3.7 ⎤
    ⎢ −11.5   1.0   5.5 ⎥ .
    ⎣  17.7   1.8  −9.3 ⎦

Solution.
The characteristic determinant gives the characteristic
equation −λ³ − λ² + 12λ = 0. The roots (eigenvalues of A)
are λ1 = 3, λ2 = −4, λ3 = 0. By the Gauss elimination applied
to (A − λI)x = 0 with λ = λ1, λ2, λ3 we find eigenvectors and
then X−1 by the Gauss–Jordan elimination

Diagonalization of a Matrix

Solution. (continued 1)
The results are the eigenvectors

⎡ −1 ⎤   ⎡  1 ⎤   ⎡ 2 ⎤
⎢  3 ⎥ , ⎢ −1 ⎥ , ⎢ 1 ⎥ ,
⎣ −1 ⎦   ⎣  3 ⎦   ⎣ 4 ⎦

hence

X = ⎡ −1   1   2 ⎤            X−1 = ⎡ −0.7   0.2   0.3 ⎤
    ⎢  3  −1   1 ⎥ ,                ⎢ −1.3  −0.2   0.7 ⎥ .
    ⎣ −1   3   4 ⎦                  ⎣  0.8   0.2  −0.2 ⎦

Diagonalization of a Matrix

Solution. (continued 2)
Calculating AX and multiplying by X−1 from the left, we
thus obtain

D = X−1AX = ⎡ −0.7   0.2   0.3 ⎤ ⎡ −3   −4   0 ⎤   ⎡ 3   0   0 ⎤
            ⎢ −1.3  −0.2   0.7 ⎥ ⎢  9    4   0 ⎥ = ⎢ 0  −4   0 ⎥ .
            ⎣  0.8   0.2  −0.2 ⎦ ⎣ −3  −12   0 ⎦   ⎣ 0   0   0 ⎦
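
This diagonalization can be verified in a few lines of NumPy (our addition): X holds the three eigenvectors as columns, and X⁻¹AX should come out as diag(3, −4, 0).

```python
import numpy as np

A = np.array([[7.3, 0.2, -3.7],
              [-11.5, 1.0, 5.5],
              [17.7, 1.8, -9.3]])
X = np.array([[-1.0, 1.0, 2.0],
              [3.0, -1.0, 1.0],
              [-1.0, 3.0, 4.0]])   # eigenvectors as columns

D = np.linalg.inv(X) @ A @ X
print(np.round(D, 10))   # diagonal matrix with entries 3, -4, 0
```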

Quadratic Forms
Transformation to Principal Axes
By definition, a quadratic form Q in the components x1, …, xn
of a vector x is a sum of n² terms, namely,

           n   n
Q = xTAx = Σ   Σ  ajk xj xk
          j=1 k=1

(7)  = a11x1² + a12x1x2 + ⋯ + a1nx1xn
     + a21x2x1 + a22x2² + ⋯ + a2nx2xn
     + ⋯⋯⋯⋯⋯⋯⋯⋯
     + an1xnx1 + an2xnx2 + ⋯ + annxn².

A = [ajk] is called the coefficient matrix of the form. We may
assume that A is symmetric, because we can take off-diagonal
terms together in pairs and write the result as a sum of two
equal terms; see the following example.
Quadratic Form
Symmetric Coefficient Matrix
Let

xTAx = [x1  x2] ⎡ 3  4 ⎤ ⎡ x1 ⎤
                ⎣ 6  2 ⎦ ⎣ x2 ⎦
     = 3x1² + 4x1x2 + 6x2x1 + 2x2²
     = 3x1² + 10x1x2 + 2x2².

Here 4 + 6 = 10 = 5 + 5.
From the corresponding symmetric matrix C = [cjk], where cjk = ½(ajk + akj),
thus c11 = 3, c12 = c21 = 5, c22 = 2, we get the same result; indeed,

xTCx = [x1  x2] ⎡ 3  5 ⎤ ⎡ x1 ⎤
                ⎣ 5  2 ⎦ ⎣ x2 ⎦
     = 3x1² + 5x1x2 + 5x2x1 + 2x2²
     = 3x1² + 10x1x2 + 2x2².
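
A quick NumPy check of this symmetrization (our addition): replacing A by its symmetric part C = (A + Aᵀ)/2 leaves the quadratic form unchanged.

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [6.0, 2.0]])
C = 0.5 * (A + A.T)           # symmetric part: [[3, 5], [5, 2]]
print(C)

x = np.array([1.0, 2.0])      # any sample vector
# x^T A x = 3*1 + 10*2 + 2*4 = 31, and x^T C x agrees
print(x @ A @ x, x @ C @ x)   # both 31.0
```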

Quadratic Form
Symmetric Coefficient Matrix
A symmetric matrix A as in (7) has an orthonormal
basis of eigenvectors (vi · vj = 0 if i ≠ j and vi · vi = 1).
Hence if we take these as column vectors, we obtain a
matrix X that is orthogonal (X−1 = XT).
Thus A = XDX−1 = XDXT. Substitution into (7) gives
(8) Q = xTXDXTx.
If we set XTx = y, then, since X−1 = XT, we have X−1x = y and
thus obtain
(9) x = Xy.
Furthermore, in (8) we have xTX = (XTx)T = yT and XTx = y,
so that Q becomes simply
(10) Q = yTDy = λ1y1² + λ2y2² + ⋯ + λnyn².
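
A sketch of (8)–(10) in NumPy (our addition): np.linalg.eigh returns an orthonormal eigenvector basis X (as columns) for a symmetric matrix; the matrix A used here is the one from the conic-section example that follows.

```python
import numpy as np

A = np.array([[17.0, -15.0],
              [-15.0, 17.0]])

lams, X = np.linalg.eigh(A)        # eigenvalues ascending, X orthogonal
print(lams)                        # [ 2. 32.]
print(np.allclose(X.T @ X, np.eye(2)))   # True: X^{-1} = X^T

x = np.array([1.0, 2.0])           # any sample vector
y = X.T @ x                        # inverting the substitution (9)
# (10): Q = x^T A x equals lambda_1*y_1^2 + lambda_2*y_2^2
print(np.isclose(x @ A @ x, np.sum(lams * y**2)))   # True
```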

Principal Axes Theorem

Theorem 5
Principal Axes Theorem

The substitution (9) transforms a quadratic form


           n   n
Q = xTAx = Σ   Σ  ajk xj xk      (akj = ajk)
          j=1 k=1

to the principal axes form or canonical form (10), where λ1, … ,


λn are the (not necessarily distinct) eigenvalues of the
(symmetric!) matrix A, and X is an orthogonal matrix with
corresponding eigenvectors x1, … , xn, respectively, as column
vectors.

Transformation to Principal Axes
Conic Sections

Find out what type of conic section the following quadratic form represents
and transform it to principal axes:

Q = 17x1² − 30x1x2 + 17x2² = 128.

Solution. We have Q = xTAx, where

A = ⎡  17  −15 ⎤ ,     x = ⎡ x1 ⎤ .
    ⎣ −15   17 ⎦           ⎣ x2 ⎦

Transformation to Principal Axes
Conic Sections

Solution. (continued 1)
This gives the characteristic equation (17 − λ)² − 15² = 0. It
has the roots λ1 = 2, λ2 = 32. Hence (10) becomes

Q = 2y1² + 32y2².

We see that Q = 128 represents the ellipse 2y1² + 32y2² = 128,
that is,

y1²/8² + y2²/2² = 1.

Transformation to Principal Axes
Conic Sections

Solution. (continued 2)
If we want to know the direction of the principal axes in the
x1x2-coordinates, we have to determine normalized
eigenvectors from (A − λI)x = 0 with λ = λ1 = 2 and
λ = λ2 = 32 and then use (9). We get
⎡ 1/√2 ⎤        ⎡ −1/√2 ⎤
⎣ 1/√2 ⎦  and   ⎣  1/√2 ⎦ .

Observe that the above vectors are orthonormal.

Transformation to Principal Axes
Conic Sections
Solution. (continued 3)
hence

x = Xy = ⎡ 1/√2  −1/√2 ⎤ ⎡ y1 ⎤ ,     x1 = y1/√2 − y2/√2
         ⎣ 1/√2   1/√2 ⎦ ⎣ y2 ⎦      x2 = y1/√2 + y2/√2.

This is a 45° rotation.
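
The whole conic-section computation can be confirmed numerically (a sketch, our addition): the 45° rotation x = Xy turns Q = 17x1² − 30x1x2 + 17x2² into 2y1² + 32y2², so a point on the ellipse in y-coordinates still satisfies Q = 128 in x-coordinates.

```python
import numpy as np

# columns are the normalized eigenvectors: a 45-degree rotation
X = np.array([[1.0, -1.0],
              [1.0, 1.0]]) / np.sqrt(2.0)

y = np.array([8.0, 0.0])   # on the ellipse: 2*8^2 + 32*0^2 = 128
x = X @ y                  # substitution (9)

Q = 17.0 * x[0]**2 - 30.0 * x[0] * x[1] + 17.0 * x[1]**2
print(Q)   # 128, up to floating-point rounding
```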
