Lecture 3


Mathematical Foundations for Data

Science
BITS Pilani MFDS Team
Pilani Campus

DSECL ZC416, MFDS

Lecture No. 3
Agenda

•  Eigenvalues and eigenvectors

•  Gerschgorin’s theorem

•  Similarity transformation

•  Diagonalization of matrices

•  Quadratic forms

BITS Pilani, Pilani Campus


Eigenvalue Problem
A matrix eigenvalue problem considers the vector equation
(1) Ax = λx,
where A is a given square matrix, λ an unknown scalar (real
or complex), and x an unknown vector.
The task is to determine the λ’s and the x’s (which depend on
the λ’s) that satisfy (1).
Since x = 0 is always a solution for any λ, we admit only
solutions with x ≠ 0.
The solutions of (1) are given the following names: the λ’s
that satisfy (1) are called eigenvalues of A, and the
corresponding nonzero x’s that also satisfy (1) are called
eigenvectors of A.

Spectrum

The set of all the eigenvalues of A is called the spectrum of


A. We shall see that the spectrum consists of at least one
eigenvalue and at most n numerically different
eigenvalues.


The largest of the absolute values of the eigenvalues of A is
called the spectral radius of A, a name to be motivated later.
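
As a quick numerical illustration (a NumPy sketch, not part of the original slides), the spectrum and spectral radius can be computed directly; the matrix below is the 2 × 2 example worked out later in this lecture.

```python
import numpy as np

# the 2x2 matrix from the worked example later in this lecture
A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

spectrum = np.linalg.eigvals(A)        # the set of eigenvalues of A
spectral_radius = max(abs(spectrum))   # largest absolute eigenvalue

print(np.sort(spectrum.real))          # eigenvalues: -6 and -1
print(spectral_radius)                 # spectral radius: 6
```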

Determination of Eigenvalues
and Eigenvectors
Example 1: Find the eigenvalues and eigenvectors of

A = ⎡ −5   2 ⎤
    ⎣  2  −2 ⎦ .

Solution.
(a) Eigenvalues. These must be determined first.
Equation (1) in components is

Ax = ⎡ −5   2 ⎤ ⎡ x1 ⎤ = λ ⎡ x1 ⎤ ;
     ⎣  2  −2 ⎦ ⎣ x2 ⎦     ⎣ x2 ⎦

that is,
−5x1 + 2x2 = λx1
 2x1 − 2x2 = λx2.

Example

Solution. (continued 1)
(a) Eigenvalues. (continued 1)
Transferring the terms on the right to the left, we get

(2*)  (−5 − λ)x1 + 2x2 = 0
       2x1 + (−2 − λ)x2 = 0.

This can be written in matrix notation as
(3*) (A − λI)x = 0,
because (1) is Ax − λx = Ax − λIx = (A − λI)x = 0,
which gives (3*).

Example

Solution. (continued 2)
(a) Eigenvalues. (continued 2)
We see that this is a homogeneous linear system. By
Cramer’s theorem in Sec. 7.7 it has a nontrivial solution (an
eigenvector of A we are looking for) if and only if its
coefficient determinant is zero, that is,

(4*)  D(λ) = det(A − λI) = | −5 − λ     2    |
                           |    2    −2 − λ  |
      = (−5 − λ)(−2 − λ) − 4 = λ² + 7λ + 6 = 0.

Example

Solution. (continued 3)
(a) Eigenvalues. (continued 3)
We call D(λ) the characteristic determinant or, if expanded,
the characteristic polynomial, and D(λ) = 0 the
characteristic equation of A. The solutions of this quadratic
equation are λ1 = −1 and λ2 = −6. These are the eigenvalues
of A.
(b1) Eigenvector of A corresponding to λ1. This vector is
obtained from (2*) with λ = λ1 = −1, that is,
−4 x1 + 2 x2 = 0

2 x1 − x2 = 0.

Example

Solution. (continued 4)
(b1) Eigenvector of A corresponding to λ1. (continued)
A solution is x2 = 2x1, as we see from either of the two
equations, so that we need only one of them. This
determines an eigenvector corresponding to λ1 = −1 up to a
scalar multiple. If we choose x1 = 1, we obtain the
eigenvector
v = ⎡ 1 ⎤ ,   Check: Av = ⎡ −5   2 ⎤ ⎡ 1 ⎤ = ⎡ −1 ⎤ = (−1)v = λ1v.
    ⎣ 2 ⎦                 ⎣  2  −2 ⎦ ⎣ 2 ⎦   ⎣ −2 ⎦

Example

Solution. (continued 5)
(b2) Eigenvector of A corresponding to λ2.
For λ = λ2 = −6, equation (2*) becomes
x1 + 2 x2 = 0
2 x1 + 4 x2 = 0.
A solution is x2 = −x1/2 with arbitrary x1. If we choose x1 = 2,
we get x2 = −1. Thus an eigenvector of A corresponding to
λ2 = −6 is
w = ⎡  2 ⎤ ,   Check: Aw = ⎡ −5   2 ⎤ ⎡  2 ⎤ = ⎡ −12 ⎤ = (−6)w = λ2w.
    ⎣ −1 ⎦                 ⎣  2  −2 ⎦ ⎣ −1 ⎦   ⎣   6 ⎦
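
The two eigenpairs can also be verified numerically; this NumPy snippet (our addition, not part of the slides) repeats the Check steps above.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

v = np.array([1.0, 2.0])    # eigenvector for lambda_1 = -1
w = np.array([2.0, -1.0])   # eigenvector for lambda_2 = -6

print(np.allclose(A @ v, -1.0 * v))   # True: A*v = (-1)*v
print(np.allclose(A @ w, -6.0 * w))   # True: A*w = (-6)*w
```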

Eigenvalue Analysis
This example illustrates the general case as follows. Equation (1) written in
components is

a11x1 + ⋯ + a1n xn = λx1
a21x1 + ⋯ + a2n xn = λx2
⋯⋯⋯⋯⋯⋯⋯⋯
an1x1 + ⋯ + ann xn = λxn.

Transferring the terms on the right side to the left side, we have

(2)  (a11 − λ)x1 + a12 x2 + ⋯ + a1n xn = 0
      a21x1 + (a22 − λ)x2 + ⋯ + a2n xn = 0
      ⋯⋯⋯⋯⋯⋯⋯⋯
      an1x1 + an2 x2 + ⋯ + (ann − λ)xn = 0.

Eigenvalue Analysis
In matrix notation,
(3) (A − λI)x = 0.
By Cramer’s theorem in Sec. 7.7, this homogeneous linear
system of equations has a nontrivial solution if and only if
the corresponding determinant of the coefficients is zero:

(4)  D(λ) = det(A − λI) = | a11 − λ    a12    ⋯    a1n    |
                          |   a21    a22 − λ  ⋯    a2n    |  = 0.
                          |    ⋅        ⋅     ⋯     ⋅     |
                          |   an1      an2    ⋯  ann − λ  |

Eigenvalue Analysis

A − λI is called the characteristic matrix and D(λ) the


characteristic determinant of A. Equation (4) is called the
characteristic equation of A. By developing D(λ) we obtain
a polynomial of nth degree in λ. This is called the
characteristic polynomial of A.
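
These definitions can be checked numerically (a sketch, assuming NumPy is available): np.poly returns the coefficients of the characteristic polynomial det(λI − A), highest power first, and its roots are the eigenvalues.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

# coefficients of det(lambda*I - A): here lambda^2 + 7*lambda + 6
coeffs = np.poly(A)
print(coeffs)                       # approximately [1. 7. 6.]

# the roots of the characteristic polynomial are the eigenvalues
print(np.sort(np.roots(coeffs)))    # approximately -6 and -1
```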

Eigenvalues

Theorem 1
Eigenvalues
The eigenvalues of a square matrix A are the roots of the
characteristic equation (4) of A.
Hence an n × n matrix has at least one eigenvalue and at most n
numerically different eigenvalues.

The eigenvalues must be determined first.


Once these are known, corresponding eigenvectors are
obtained from the system (2), for instance, by the Gauss
elimination, where λ is the eigenvalue for which an
eigenvector is wanted.

Eigenspace

Theorem 2
Eigenvectors, Eigenspace
If w and x are eigenvectors of a matrix A corresponding to the
same eigenvalue λ, so are w + x (provided x ≠ −w) and kx for
any k ≠ 0.
Hence the eigenvectors corresponding to one and the same
eigenvalue λ of A, together with 0, form a vector space called the
eigenspace of A corresponding to that λ.

Multiple Eigenvalues

Example 2: Find the eigenvalues and eigenvectors of


A = ⎡ −2   2  −3 ⎤
    ⎢  2   1  −6 ⎥ .
    ⎣ −1  −2   0 ⎦


Solution.
For our matrix, the characteristic determinant gives the
characteristic equation
−λ³ − λ² + 21λ + 45 = 0.
The roots (eigenvalues of A) are λ1 = 5, λ2 = λ3 = −3.

Multiple Eigenvalues

Solution. (continued 1)
To find eigenvectors, we apply the Gauss elimination (Sec.
7.3) to the system (A − λI)x = 0, first with λ = 5
and then with λ = −3. For λ = 5 the characteristic matrix is

A − λI = A − 5I = ⎡ −7   2  −3 ⎤
                  ⎢  2  −4  −6 ⎥ .
                  ⎣ −1  −2  −5 ⎦

It row-reduces to

⎡ −7      2      −3    ⎤
⎢  0   −24/7   −48/7   ⎥ .
⎣  0      0       0    ⎦

Multiple Eigenvalues

Solution. (continued 2)
Hence it has rank 2. Choosing x3 = −1 we have x2 = 2 from
−(24/7)x2 − (48/7)x3 = 0 and then x1 = 1 from −7x1 + 2x2 − 3x3 = 0.
Hence an eigenvector of A corresponding to λ = 5 is
x1 = [1  2  −1]T.
For λ = −3 the characteristic matrix

A − λI = A + 3I = ⎡  1   2  −3 ⎤
                  ⎢  2   4  −6 ⎥
                  ⎣ −1  −2   3 ⎦

row-reduces to

⎡ 1  2  −3 ⎤
⎢ 0  0   0 ⎥ .
⎣ 0  0   0 ⎦

Multiple Eigenvalues

Solution. (continued 3)
Hence it has rank 1.
From x1 + 2x2 − 3x3 = 0 we have x1 = −2x2 + 3x3. Choosing
x2 = 1, x3 = 0 and x2 = 0, x3 = 1, we obtain two linearly
independent eigenvectors of A corresponding to λ = −3 [as
they must exist by (5), Sec. 7.5, with rank = 1 and n = 3],
x2 = ⎡ −2 ⎤         x3 = ⎡ 3 ⎤
     ⎢  1 ⎥   and        ⎢ 0 ⎥ .
     ⎣  0 ⎦              ⎣ 1 ⎦
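
Example 2 can be checked numerically (a NumPy sketch, our addition): λ = −3 is a double eigenvalue, and A + 3I has rank 1, so its null space (the eigenspace) is 2-dimensional.

```python
import numpy as np

A = np.array([[-2.0, 2.0, -3.0],
              [2.0, 1.0, -6.0],
              [-1.0, -2.0, 0.0]])

# eigenvalues should be 5, -3, -3
print(np.sort(np.linalg.eigvals(A).real))

# rank of the characteristic matrix A + 3I, and the eigenspace
# dimension n - rank = 3 - 1 = 2 for lambda = -3
rank = np.linalg.matrix_rank(A + 3.0 * np.eye(3))
print(rank, 3 - rank)   # 1 2
```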

Gerschgorin’s Theorem
The theorem gives bounds on the eigenvalues.
Every eigenvalue λ of an n × n matrix A = [aij] satisfies

|λ − aii| ≤ Σj≠i |aij|   for at least one i, 1 ≤ i ≤ n.

Example:

A = ⎡  0   1/2  1/2 ⎤
    ⎢ 1/2   5    1  ⎥
    ⎣ 1/2   1    1  ⎦

We get the Gerschgorin disks
D1: centre 0, radius 1
D2: centre 5, radius 1.5
D3: centre 1, radius 1.5
The centres are the main diagonal entries of A. These would
be the eigenvalues of A if A were diagonal.
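
A small NumPy sketch of the theorem (our addition; the matrix entries are as read from the slide, consistent with the three disks listed above):

```python
import numpy as np

A = np.array([[0.0, 0.5, 0.5],
              [0.5, 5.0, 1.0],
              [0.5, 1.0, 1.0]])

centres = np.diag(A)                          # main diagonal entries
radii = np.abs(A).sum(axis=1) - np.abs(centres)  # off-diagonal row sums
print(centres)   # [0. 5. 1.]
print(radii)     # [1.  1.5 1.5]

# every eigenvalue lies in at least one Gerschgorin disk
for lam in np.linalg.eigvals(A):
    assert min(abs(lam - centres) - radii) <= 1e-9
```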
Algebraic Multiplicity
& Geometric Multiplicity
The order Mλ of an eigenvalue λ as a root of the
characteristic polynomial is called the algebraic
multiplicity of λ. The number mλ of linearly independent
eigenvectors corresponding to λ is called the geometric
multiplicity of λ. Thus mλ is the dimension of the
eigenspace corresponding to this λ.
Since the characteristic polynomial has degree n, the sum
of all the algebraic multiplicities must equal n. In Example 2
for λ = −3 we have mλ = Mλ = 2. In general, mλ ≤ Mλ, as can
be shown. The difference Δλ = Mλ − mλ is called the defect of
λ. Thus Δ−3 = 0 in Example 2, but positive defects Δλ can
easily occur.
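
A positive defect is easy to exhibit; this sketch uses the standard matrix [[0, 1], [0, 0]] (our example, not from the slides):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# the characteristic polynomial is lambda^2, so lambda = 0 has
# algebraic multiplicity M = 2 ...
M = 2

# ... but the eigenspace of lambda = 0 is the null space of A - 0*I,
# whose dimension is n - rank(A) = 2 - 1 = 1
m = 2 - np.linalg.matrix_rank(A)

print(M, m, M - m)   # 2 1 1, i.e. the defect Delta is 1
```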

Special cases
Theorem 3
Eigenvalues of the Transpose
The transpose AT of a square matrix A has the same eigenvalues
as A.

Basis of Eigenvectors

If an n × n matrix A has n distinct eigenvalues, then A has a basis
of eigenvectors x1, … , xn for Rn.

Eigenvectors corresponding to Distinct
Eigenvalues are Linearly Independent
Let v1, v2, …, vp be eigenvectors of A corresponding to the distinct
eigenvalues λ1, λ2, …, λp. Let k be the largest positive integer such that
v1, v2, …, vk are linearly independent. If k = p, nothing is to be proved.
If k < p, then vk+1 is a linear combination of v1, …, vk; that is, there exist
constants c1, c2, …, ck such that
vk+1 = c1v1 + c2v2 + ⋯ + ckvk.
Applying the matrix A to both sides, we have
Avk+1 = λk+1vk+1
      = λk+1(c1v1 + c2v2 + ⋯ + ckvk)
      = c1λk+1v1 + c2λk+1v2 + ⋯ + ckλk+1vk;
Avk+1 = A(c1v1 + c2v2 + ⋯ + ckvk)
      = c1Av1 + c2Av2 + ⋯ + ckAvk
      = c1λ1v1 + c2λ2v2 + ⋯ + ckλkvk.
Subtracting the two expressions for Avk+1, we get
c1(λk+1 − λ1)v1 + c2(λk+1 − λ2)v2 + ⋯ + ck(λk+1 − λk)vk = 0.
Since v1, v2, …, vk are linearly independent, we have
c1(λk+1 − λ1) = c2(λk+1 − λ2) = ⋯ = ck(λk+1 − λk) = 0.
Note that the eigenvalues are distinct. Hence
c1 = c2 = ⋯ = ck = 0,
which implies that vk+1 is the zero vector. This contradicts vk+1 ≠ 0.

Similarity of Matrices

Similar Matrices. Similarity Transformation

An n × n matrix Â is called similar to an n × n matrix A if

(4) Â = P−1AP

for some (nonsingular!) n × n matrix P. This transformation,
which gives Â from A, is called a similarity transformation.

Eigenvalues and Eigenvectors of Similar Matrices

If Â is similar to A, then Â has the same eigenvalues as A.
Furthermore, if x is an eigenvector of A, then y = P−1x is an
eigenvector of Â corresponding to the same eigenvalue.
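
Both claims can be checked numerically; this sketch uses our own choice of P (any nonsingular matrix works) and the 2 × 2 example matrix from earlier in the lecture.

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # any nonsingular P will do

P_inv = np.linalg.inv(P)
A_hat = P_inv @ A @ P               # the similarity transformation

# same eigenvalues
print(np.allclose(np.sort(np.linalg.eigvals(A).real),
                  np.sort(np.linalg.eigvals(A_hat).real)))   # True

# y = P^{-1} x is an eigenvector of A_hat for the same eigenvalue
x = np.array([1.0, 2.0])            # eigenvector of A for lambda = -1
y = P_inv @ x
print(np.allclose(A_hat @ y, -1.0 * y))                      # True
```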

Diagonalization of a Matrix

Diagonalization of a Matrix

If an n × n matrix A has a basis of eigenvectors, then


(5) D = X−1AX
is diagonal, with the eigenvalues of A as the entries on the main
diagonal. Here X is the matrix with these eigenvectors as column
vectors. Also,
(5*) Dm = X−1AmX (m = 2, 3, … ).

Diagonalization of a Matrix

Diagonalize

A = ⎡   7.3   0.2  −3.7 ⎤
    ⎢ −11.5   1.0   5.5 ⎥ .
    ⎣  17.7   1.8  −9.3 ⎦

Solution.
The characteristic determinant gives the characteristic
equation −λ³ − λ² + 12λ = 0. The roots (eigenvalues of A)
are λ1 = 3, λ2 = −4, λ3 = 0. By the Gauss elimination applied
to (A − λI)x = 0 with λ = λ1, λ2, λ3 we find eigenvectors and
then X−1 by the Gauss–Jordan elimination

Diagonalization of a Matrix

Solution. (continued 1)
The results are the eigenvectors

⎡ −1 ⎤   ⎡  1 ⎤   ⎡ 2 ⎤
⎢  3 ⎥ , ⎢ −1 ⎥ , ⎢ 1 ⎥ ,
⎣ −1 ⎦   ⎣  3 ⎦   ⎣ 4 ⎦

hence

X = ⎡ −1   1   2 ⎤            X−1 = ⎡ −0.7   0.2   0.3 ⎤
    ⎢  3  −1   1 ⎥ ,                ⎢ −1.3  −0.2   0.7 ⎥ .
    ⎣ −1   3   4 ⎦                  ⎣  0.8   0.2  −0.2 ⎦

Diagonalization of a Matrix

Solution. (continued 2)
Calculating AX and multiplying by X−1 from the left, we
thus obtain

D = X−1AX = ⎡ −0.7   0.2   0.3 ⎤ ⎡ −3   −4   0 ⎤   ⎡ 3   0   0 ⎤
            ⎢ −1.3  −0.2   0.7 ⎥ ⎢  9    4   0 ⎥ = ⎢ 0  −4   0 ⎥ .
            ⎣  0.8   0.2  −0.2 ⎦ ⎣ −3  −12   0 ⎦   ⎣ 0   0   0 ⎦
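
This diagonalization can be verified in a few lines of NumPy (our addition): X holds the three eigenvectors as columns, and X⁻¹AX should come out as diag(3, −4, 0).

```python
import numpy as np

A = np.array([[7.3, 0.2, -3.7],
              [-11.5, 1.0, 5.5],
              [17.7, 1.8, -9.3]])
X = np.array([[-1.0, 1.0, 2.0],
              [3.0, -1.0, 1.0],
              [-1.0, 3.0, 4.0]])   # eigenvectors as columns

D = np.linalg.inv(X) @ A @ X
print(np.round(D, 10))   # diagonal matrix with entries 3, -4, 0
```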

Quadratic Forms
Transformation to Principal Axes
By definition, a quadratic form Q in the components x1, …, xn
of a vector x is a sum of n² terms, namely,

           n   n
Q = xTAx = Σ   Σ  ajk xj xk
          j=1 k=1

(7)  = a11x1² + a12x1x2 + ⋯ + a1nx1xn
     + a21x2x1 + a22x2² + ⋯ + a2nx2xn
     + ⋯⋯⋯⋯⋯⋯⋯⋯
     + an1xnx1 + an2xnx2 + ⋯ + annxn².

A = [ajk] is called the coefficient matrix of the form. We may
assume that A is symmetric, because we can take off-diagonal
terms together in pairs and write the result as a sum of two
equal terms; see the following example.
Quadratic Form
Symmetric Coefficient Matrix
Let

xTAx = [x1  x2] ⎡ 3  4 ⎤ ⎡ x1 ⎤
                ⎣ 6  2 ⎦ ⎣ x2 ⎦
     = 3x1² + 4x1x2 + 6x2x1 + 2x2²
     = 3x1² + 10x1x2 + 2x2².

Here 4 + 6 = 10 = 5 + 5.
From the corresponding symmetric matrix C = [cjk], where cjk = ½(ajk + akj),
thus c11 = 3, c12 = c21 = 5, c22 = 2, we get the same result; indeed,

xTCx = [x1  x2] ⎡ 3  5 ⎤ ⎡ x1 ⎤
                ⎣ 5  2 ⎦ ⎣ x2 ⎦
     = 3x1² + 5x1x2 + 5x2x1 + 2x2²
     = 3x1² + 10x1x2 + 2x2².
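
A quick NumPy check of this symmetrization (our addition): replacing A by its symmetric part C = (A + Aᵀ)/2 leaves the quadratic form unchanged.

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [6.0, 2.0]])
C = 0.5 * (A + A.T)           # symmetric part: [[3, 5], [5, 2]]
print(C)

x = np.array([1.0, 2.0])      # any sample vector
# x^T A x = 3*1 + 10*2 + 2*4 = 31, and x^T C x agrees
print(x @ A @ x, x @ C @ x)   # both 31.0
```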

Quadratic Form
Symmetric Coefficient Matrix
A symmetric matrix A as in (7) has an orthonormal
basis of eigenvectors (vi · vj = 0 if i ≠ j and vi · vi = 1).
Hence if we take these as column vectors, we obtain a
matrix X that is orthogonal (X−1 = XT).
Thus A = XDX−1 = XDXT. Substitution into (7) gives
(8) Q = xTXDXTx.
If we set XTx = y, then, since X−1 = XT, we have X−1x = y and
thus obtain
(9) x = Xy.
Furthermore, in (8) we have xTX = (XTx)T = yT and XTx = y,
so that Q becomes simply
(10) Q = yTDy = λ1y1² + λ2y2² + ⋯ + λnyn².
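
A sketch of (8)–(10) in NumPy (our addition): np.linalg.eigh returns an orthonormal eigenvector basis X (as columns) for a symmetric matrix; the matrix A used here is the one from the conic-section example that follows.

```python
import numpy as np

A = np.array([[17.0, -15.0],
              [-15.0, 17.0]])

lams, X = np.linalg.eigh(A)        # eigenvalues ascending, X orthogonal
print(lams)                        # [ 2. 32.]
print(np.allclose(X.T @ X, np.eye(2)))   # True: X^{-1} = X^T

x = np.array([1.0, 2.0])           # any sample vector
y = X.T @ x                        # inverting the substitution (9)
# (10): Q = x^T A x equals lambda_1*y_1^2 + lambda_2*y_2^2
print(np.isclose(x @ A @ x, np.sum(lams * y**2)))   # True
```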

Principal Axes Theorem

Theorem 5
Principal Axes Theorem

The substitution (9) transforms a quadratic form


           n   n
Q = xTAx = Σ   Σ  ajk xj xk      (akj = ajk)
          j=1 k=1

to the principal axes form or canonical form (10), where λ1, … ,


λn are the (not necessarily distinct) eigenvalues of the
(symmetric!) matrix A, and X is an orthogonal matrix with
corresponding eigenvectors x1, … , xn, respectively, as column
vectors.

Transformation to Principal Axes
Conic Sections

Find out what type of conic section the following quadratic form represents
and transform it to principal axes:

Q = 17x1² − 30x1x2 + 17x2² = 128.

Solution. We have Q = xTAx, where

A = ⎡  17  −15 ⎤ ,     x = ⎡ x1 ⎤ .
    ⎣ −15   17 ⎦           ⎣ x2 ⎦

Transformation to Principal Axes
Conic Sections

Solution. (continued 1)
This gives the characteristic equation (17 − λ)² − 15² = 0. It
has the roots λ1 = 2, λ2 = 32. Hence (10) becomes

Q = 2y1² + 32y2².

We see that Q = 128 represents the ellipse 2y1² + 32y2² = 128,
that is,

y1²/8² + y2²/2² = 1.

Transformation to Principal Axes
Conic Sections

Solution. (continued 2)
If we want to know the direction of the principal axes in the
x1x2-coordinates, we have to determine normalized
eigenvectors from (A − λI)x = 0 with λ = λ1 = 2 and
λ = λ2 = 32 and then use (9). We get
⎡ 1/√2 ⎤        ⎡ −1/√2 ⎤
⎣ 1/√2 ⎦  and   ⎣  1/√2 ⎦ .

Observe that the above vectors are orthonormal.

Transformation to Principal Axes
Conic Sections
Solution. (continued 3)
hence

x = Xy = ⎡ 1/√2  −1/√2 ⎤ ⎡ y1 ⎤ ,     x1 = y1/√2 − y2/√2
         ⎣ 1/√2   1/√2 ⎦ ⎣ y2 ⎦      x2 = y1/√2 + y2/√2.

This is a 45° rotation.
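
The whole conic-section computation can be confirmed numerically (a sketch, our addition): the 45° rotation x = Xy turns Q = 17x1² − 30x1x2 + 17x2² into 2y1² + 32y2², so a point on the ellipse in y-coordinates still satisfies Q = 128 in x-coordinates.

```python
import numpy as np

# columns are the normalized eigenvectors: a 45-degree rotation
X = np.array([[1.0, -1.0],
              [1.0, 1.0]]) / np.sqrt(2.0)

y = np.array([8.0, 0.0])   # on the ellipse: 2*8^2 + 32*0^2 = 128
x = X @ y                  # substitution (9)

Q = 17.0 * x[0]**2 - 30.0 * x[0] * x[1] + 17.0 * x[1]**2
print(Q)   # 128, up to floating-point rounding
```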
