Lecture 04

The document discusses numerical linear algebra, focusing on solving linear systems using LU factorization, which involves decomposing a matrix A into lower (L) and upper (U) triangular matrices. It outlines the steps for solving the equation Ax = b through forward and backward sweeps, as well as algorithms for these processes. Additionally, it covers concepts related to eigenvalues and eigenvectors, including the power method for finding the dominant eigenvalue and associated eigenvector.


Outline

1 Numerical Linear Algebra


LU Factorization
Solving Ax = b

Solving Ax = b is central to scientific computing. It is also needed in:


• Kernel Ridge Regression
• Second-order optimization methods
Steps to solve Ax = b:
• Factor the given matrix as follows:

A = LU,

where L is a lower triangular matrix and U is an upper triangular matrix.


• Solve Ax = b by solving LUx = b in the following two steps (a minimal sketch follows the list):
1 Solve Ly = b (the forward sweep)
2 Solve Ux = y (the backward sweep)
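As an end-to-end illustration, here is a minimal sketch using SciPy's LU routines: lu_factor computes a pivoted LU factorization and lu_solve runs the two sweeps.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[3., 5.],
              [6., 7.]])
b = np.array([9., 4.])

lu, piv = lu_factor(A)      # factor PA = LU (partial pivoting)
x = lu_solve((lu, piv), b)  # forward sweep (Ly = Pb), then backward sweep (Ux = y)
print(x)                    # x solves Ax = b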
Forward and Backward Sweeps

Forward Sweep:
\[
\begin{bmatrix}
\ell_{11} & 0 & \cdots & 0 \\
\ell_{21} & \ell_{22} & \ddots & \vdots \\
\vdots & & \ddots & 0 \\
\ell_{n1} & \cdots & \cdots & \ell_{nn}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\]
Forward and Backward Sweeps

Backward Sweep:
\[
\begin{bmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
0 & u_{22} & \cdots & \vdots \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & u_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]
Algorithms for Forward and Backward Sweeps

Algorithm for forward sweep:
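A minimal NumPy sketch of forward substitution (the function name forward_sweep is illustrative):

import numpy as np

def forward_sweep(L, b):
    # Solve L y = b for lower triangular L by forward substitution.
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        # Subtract the contributions of the already-computed y[0..i-1],
        # then divide by the diagonal entry.
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y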

Algorithm for backward sweep:
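And a matching sketch of backward substitution (backward_sweep is likewise illustrative):

import numpy as np

def backward_sweep(U, y):
    # Solve U x = y for upper triangular U by backward substitution.
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the contributions of the already-computed x[i+1..n-1],
        # then divide by the diagonal entry.
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x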


Algebra of Triangular Matrices

1 The inverse of an upper (lower) triangular matrix is upper (lower) triangular
2 The product of two upper (lower) triangular matrices is upper (lower) triangular
3 The inverse of a unit upper (lower) triangular matrix is unit upper (lower) triangular
4 The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular
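A quick NumPy check of properties 2 and 3 for the unit lower triangular case (the matrices are arbitrary examples):

import numpy as np

L1 = np.array([[1., 0., 0.],
               [2., 1., 0.],
               [3., 4., 1.]])
L2 = np.array([[1., 0., 0.],
               [5., 1., 0.],
               [6., 7., 1.]])

print(L1 @ L2)            # product: again unit lower triangular
print(np.linalg.inv(L1))  # inverse: again unit lower triangular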
Solving simultaneous linear systems: Algebraic way

Find x1 and x2 such that

3x1 + 5x2 = 9
6x1 + 7x2 = 4
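For example, subtracting twice the first equation from the second (the row operation R2 ← R2 − 2R1 formalized below) gives −3x2 = −14, so x2 = 14/3 and, substituting back, x1 = −43/9.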
Elementary Row Operations
1 Row switching: Ri ↔ Rj

2 Row multiplication: Ri ← kRi (k ≠ 0)

3 Row addition: Ri ← Ri + kRj


Gauss Transforms

What is τ?
\[
\begin{bmatrix} 1 & 0 \\ -\tau & 1 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
=
\begin{bmatrix} v_1 \\ 0 \end{bmatrix}
\]

More generally, which matrix to multiply on the left to create zeros below vk?
\[
\begin{bmatrix} v_1 \\ \vdots \\ v_k \\ v_{k+1} \\ \vdots \\ v_n \end{bmatrix}
\;\longmapsto\;
\begin{bmatrix} v_1 \\ \vdots \\ v_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]
Gauss Transforms

Suppose v ∈ R^n with v_k ≠ 0. Define
\[
\tau^T = [0, \ldots, 0, \tau_{k+1}, \ldots, \tau_n], \qquad \tau_i = \frac{v_i}{v_k}, \quad i = k+1 : n,
\]
and M_k = I_n − τ e_k^T. Then
\[
M_k v =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -\tau_{k+1} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -\tau_n & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix} v_1 \\ \vdots \\ v_k \\ v_{k+1} \\ \vdots \\ v_n \end{bmatrix}
=
\begin{bmatrix} v_1 \\ \vdots \\ v_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]
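A minimal NumPy sketch of building and applying a Gauss transform (0-based indexing; gauss_transform is an illustrative name):

import numpy as np

def gauss_transform(v, k):
    # Build M_k = I - tau e_k^T, which zeroes v below position k (0-based).
    n = v.size
    tau = np.zeros(n)
    tau[k+1:] = v[k+1:] / v[k]       # multipliers tau_i = v_i / v_k
    return np.eye(n) - np.outer(tau, np.eye(n)[k])

v = np.array([1., 2., 3.])
M0 = gauss_transform(v, 0)
print(M0 @ v)                        # [1., 0., 0.]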
Upper Triangularizing a Matrix

\[
A = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix}
\]

1 Make zeros below the diagonal of the 1st column:
\[
M_1 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix}
\implies
M_1 A = \begin{bmatrix} 1 & 4 & 7 \\ 0 & -3 & -6 \\ 0 & -6 & -11 \end{bmatrix}
\]

2 Make zeros below the diagonal of the 2nd column:
\[
M_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\implies
M_2 M_1 A = \begin{bmatrix} 1 & 4 & 7 \\ 0 & -3 & -6 \\ 0 & 0 & 1 \end{bmatrix}
\]
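A quick NumPy check of the two steps above:

import numpy as np

A  = np.array([[1., 4., 7.], [2., 5., 8.], [3., 6., 10.]])
M1 = np.array([[1., 0., 0.], [-2., 1., 0.], [-3., 0., 1.]])
M2 = np.array([[1., 0., 0.], [0., 1., 0.], [0., -2., 1.]])

print(M2 @ M1 @ A)   # upper triangular: [[1, 4, 7], [0, -3, -6], [0, 0, 1]]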
Remarks on upper triangularization

1 At the start of the kth loop we have a matrix
\[
A^{(k-1)} = M_{k-1} \cdots M_1 A
\]
that is upper triangular in columns 1 through k − 1
2 The multipliers in the kth Gauss transform M_k are based on A^{(k-1)}(k+1:n, k), and the pivot a_{kk}^{(k-1)} must be non-zero in order to proceed
Solving simultaneous linear systems: Matrix view

\[
\begin{bmatrix} 3 & 5 \\ 6 & 7 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 9 \\ 4 \end{bmatrix}
\]

Idea: Keep making zeros below the main diagonal. The matrix then becomes upper triangular, and the system can be solved with a backward sweep.
Existence of LU factorization

If no zero pivots are encountered, then Gauss transforms M_1, …, M_{n−1} are generated such that
\[
M_{n-1} \cdots M_1 A = U
\]
is upper triangular.
If M_k = I_n − τ^{(k)} e_k^T, then M_k^{-1} = I_n + τ^{(k)} e_k^T, and
\[
A = LU, \qquad \text{where } L = M_1^{-1} \cdots M_{n-1}^{-1}.
\]
LU Factorization

Theorem
If A ∈ R^{n×n} and det(A(1:k, 1:k)) ≠ 0 for k = 1 : n − 1, then there exists a unit lower triangular L ∈ R^{n×n} and an upper triangular U ∈ R^{n×n} such that A = LU. If this is the case and A is nonsingular, then the factorization is unique and det(A) = u_{11} u_{22} ⋯ u_{nn}.
Simplify L

• We have
\[
L = M_1^{-1} \cdots M_{n-1}^{-1}
\]
• Construction of L is not complicated:
\[
\begin{aligned}
L &= M_1^{-1} \cdots M_{n-1}^{-1} \\
  &= (I_n - \tau^{(1)} e_1^T)^{-1} \cdots (I_n - \tau^{(n-1)} e_{n-1}^T)^{-1} \\
  &= (I_n + \tau^{(1)} e_1^T) \cdots (I_n + \tau^{(n-1)} e_{n-1}^T)
\end{aligned}
\]
• Here τ^{(k)} = [0, ⋯, 0, τ_{k+1}, ⋯, τ_n]^T.
• Have a look at the "mixed" terms: τ^{(i)} e_i^T τ^{(j)} e_j^T
Simplify L

• Do these "mixed" terms
\[
\tau^{(i)} e_i^T \tau^{(j)} e_j^T
\]
survive? They don't (hint: e_i^T τ^{(j)} = 0 for i < j, since τ^{(j)} is zero in its first j entries). Exercise!
• We have simplified L:
\[
L = I_n + \sum_{k=1}^{n-1} \tau^{(k)} e_k^T, \qquad L(k+1:n, k) = \tau^{(k)}(k+1:n)
\]
Practical Implementation

1 It is enough to update A(k + 1 : n, k + 1 : n)


2 We can overwrite A(k + 1 : n, k) with L(k + 1 : n, k)
The steps are given in the algorithm below.
LU Algorithm

1: for k = 1 to n − 1 do
2: A(k + 1 : n, k) ← A(k + 1 : n, k)/A(k, k)
3: for i = k + 1 to n do
4: for j = k + 1 to n do
5: A(i, j) ← A(i, j) − A(i, k) · A(k, j)
6: end for
7: end for
8: end for
Vectorize the jth loop:
LU Algorithm: After Vectorization of jth loop

1: for k = 1 to n − 1 do
2: A(k + 1 : n, k) ← A(k + 1 : n, k)/A(k, k)
3: for i = k + 1 to n do
4: A(i, k + 1 : n) ← A(i, k + 1 : n) − A(i, k) · A(k, k + 1 : n)
5: end for
6: end for
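A direct NumPy translation of the vectorized algorithm, as a sketch (lu_inplace is an illustrative name; there is no pivoting, so non-zero pivots are assumed):

import numpy as np

def lu_inplace(A):
    # Overwrite A with its LU factors: the strict lower triangle holds the
    # multipliers of L (unit diagonal implied), the upper triangle holds U.
    n = A.shape[0]
    for k in range(n - 1):
        A[k+1:, k] /= A[k, k]                               # multipliers tau^(k)
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])   # rank-1 update
    return A

A = np.array([[1., 4., 7.], [2., 5., 8.], [3., 6., 10.]])
lu_inplace(A)
L = np.tril(A, -1) + np.eye(3)   # unpack L (unit diagonal)
U = np.triu(A)                   # unpack U
print(L @ U)                     # reproduces the original A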
Eigenvalues and Eigenvectors

A ∈ R^{n×n}. A vector v ∈ R^n, v ≠ 0, is called an eigenvector of A if there exists a λ ∈ C such that

Av = λv

Here λ is called an eigenvalue of A.

• The pair (λ, v) is called an eigenpair of A
• Each eigenvector has a unique eigenvalue associated with it
• Each eigenvalue is associated with infinitely many eigenvectors (any non-zero scalar multiple of an eigenvector is again an eigenvector)
• The set of all eigenvalues of A is called the spectrum of A
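For a concrete example, NumPy's eigen-solver returns the spectrum and eigenvectors directly:

import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
lam, V = np.linalg.eig(A)                           # spectrum {3, 1}
print(np.allclose(A @ V[:, 0], lam[0] * V[:, 0]))   # checks Av = lambda v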
Facts about Eigenvalues and Eigenvectors

• λ is an eigenvalue of A if and only if

det(λI − A) = 0

• The above equation is called the characteristic equation of A
• It is a useful theoretical device, but of little value for computing eigenvalues
• It is not hard to see that det(λI − A) is a polynomial of degree n in λ
• Here det(λI − A) is called the characteristic polynomial of A
Computing Eigenvalues and Eigenvectors

• The eigenvalue problem and the polynomial root-finding problem are equivalent
• (Abel) There is no general formula for the roots of a polynomial equation of degree > 4
• Hence there is no general finite formula for computing eigenvalues for n > 4
Division of numerical methods:
• Direct: Produce the result in a finite number of steps. Examples: LU, QR
• Iterative: Produce a sequence of approximations converging to the required result
Power method and extensions

Assume:
• A ∈ Rn×n
• A is semi-simple: A has n linearly independent eigenvectors, which form a basis of R^n
• The eigenvalues are ordered: |λ1| ≥ |λ2| ≥ ⋯ ≥ |λn|
• λ1 is called the dominant eigenvalue when |λ1| > |λ2|
Power Method: If A has a dominant eigenvalue, then we can find it and an associated eigenvector.
Power Method: Basic Idea

Idea: Generate the following sequence

q, Aq, A2 q, · · ·

Claim: Suitably scaled, the above sequence converges to an eigenvector associated with the dominant eigenvalue of A, for almost any initial vector q. Why?
Power Method Finds the Largest Eigenvector

Proof: Given a vector q, since v1, v2, ⋯, vn form a basis for R^n, there exist constants c1, ⋯, cn such that

q = c1 v1 + c2 v2 + ⋯ + cn vn

In general c1 will be non-zero. Multiplying q by A, we have

\[
\begin{aligned}
Aq &= c_1 A v_1 + c_2 A v_2 + \cdots + c_n A v_n \\
&= c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_n \lambda_n v_n \\
A^2 q &= c_1 \lambda_1^2 v_1 + c_2 \lambda_2^2 v_2 + \cdots + c_n \lambda_n^2 v_n \\
A^j q &= c_1 \lambda_1^j v_1 + c_2 \lambda_2^j v_2 + \cdots + c_n \lambda_n^j v_n \\
&= \lambda_1^j \left( c_1 v_1 + c_2 (\lambda_2/\lambda_1)^j v_2 + \cdots + c_n (\lambda_n/\lambda_1)^j v_n \right)
\end{aligned}
\]

From the second term onwards, every term goes to zero as j → ∞, since |λi/λ1| < 1 for i ≥ 2.


Power Method Algorithm

Let q_j = A^j q / λ_1^j; then q_j → c_1 v_1 as j → ∞. We have
\[
\begin{aligned}
\|q_j - c_1 v_1\| &= \|c_2 (\lambda_2/\lambda_1)^j v_2 + \cdots + c_n (\lambda_n/\lambda_1)^j v_n\| \\
&\le |c_2| |\lambda_2/\lambda_1|^j \|v_2\| + \cdots + |c_n| |\lambda_n/\lambda_1|^j \|v_n\| \\
&\le (|c_2| \|v_2\| + \cdots + |c_n| \|v_n\|) |\lambda_2/\lambda_1|^j
\end{aligned}
\]
Note: we used |λi| ≤ |λ2|, i = 3, ⋯, n.

Let C = |c2| ∥v2∥ + ⋯ + |cn| ∥vn∥; then
\[
\|q_j - c_1 v_1\| \le C |\lambda_2/\lambda_1|^j, \quad j = 1, 2, 3, \ldots
\]
Clearly, since |λ1| > |λ2|, it follows that |λ2/λ1|^j → 0 as j → ∞, so q_j → c_1 v_1 linearly with rate |λ2/λ1|.
Algorithm: Power Method

Find the dominant eigenvector of A

1: Choose a random vector q1 and a tolerance tol
2: for i = 1, 2, . . . do
3: qi+1 ← Aqi / ∥Aqi∥∞
4: if ∥qi+1 − qi∥ ≤ tol then
5: break
6: end if
7: end for
• Flops per iteration: 2n², assuming A is not a sparse matrix
• For sparse matrices the flop count is considerably smaller
• If A has five non-zero entries per row, then the cost of Aqi is only 10n flops
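A minimal NumPy sketch of the algorithm (power_method is an illustrative name; the stopping test assumes λ1 > 0, since with a negative dominant eigenvalue the ∞-norm-scaled iterates alternate in sign):

import numpy as np

def power_method(A, tol=1e-10, max_iter=1000):
    # Dominant eigenpair of A; assumes |lambda_1| > |lambda_2| and lambda_1 > 0.
    q = np.random.default_rng(0).standard_normal(A.shape[0])
    q /= np.linalg.norm(q, np.inf)
    for _ in range(max_iter):
        z = A @ q
        q_new = z / np.linalg.norm(z, np.inf)   # rescale each step
        if np.linalg.norm(q_new - q) <= tol:
            q = q_new
            break
        q = q_new
    lam = q @ A @ q / (q @ q)   # Rayleigh quotient estimate of lambda_1
    return lam, q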
Inverse Iteration and Shift-and-Invert Strategy

Assumption: A ∈ R^{n×n} is semisimple, with linearly independent eigenvectors v1, ⋯, vn and associated eigenvalues λ1, ⋯, λn arranged in descending order of magnitude.

Fact
If A is non-singular, then all the eigenvalues of A are non-zero. Show that if v is an eigenvector of A associated with eigenvalue λ, then v is also an eigenvector of A^{-1} associated with eigenvalue λ^{-1}.

Proof in class.
Inverse Iteration

Find the eigenvector of A associated with the smallest-magnitude eigenvalue

Key Idea: The eigenvector of A for the smallest-magnitude eigenvalue is the dominant eigenvector of A^{-1}
1: Choose a random vector q1 and a tolerance tol
2: for i = 1, 2, . . . do
3: qi+1 ← A^{-1}qi / ∥A^{-1}qi∥∞
4: if ∥qi+1 − qi∥ ≤ tol then
5: break
6: end if
7: end for
• The only change compared to the power method is in line 3.
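A sketch of inverse iteration that factors A once with LU and then solves A z = qi by the two sweeps, rather than forming A^{-1} (inverse_iteration is an illustrative name):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, tol=1e-10, max_iter=1000):
    lu, piv = lu_factor(A)                  # factor once, reuse every iteration
    q = np.random.default_rng(0).standard_normal(A.shape[0])
    q /= np.linalg.norm(q, np.inf)
    for _ in range(max_iter):
        z = lu_solve((lu, piv), q)          # z = A^{-1} q via forward/backward sweeps
        q_new = z / np.linalg.norm(z, np.inf)
        if np.linalg.norm(q_new - q) <= tol:
            return q_new
        q = q_new
    return q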
Towards Shift-and-Invert Iteration

Fact
Let ρ ∈ R. Show that if v is an eigenvector of A with eigenvalue λ, then v is also an eigenvector of A − ρI with eigenvalue λ − ρ.

Proof in class.
Shift-and-Invert Idea

• Let λ1 ≥ λ2 ≥ ⋯ ≥ λn be the eigenvalues of A
• A − ρI has eigenvalues λ1 − ρ, λ2 − ρ, . . . , λn − ρ, where ρ is the shift
• To find the eigenvector corresponding to eigenvalue λi, choose the shift ρ close to λi, so that λi − ρ is the smallest-magnitude eigenvalue of A − ρI
• Now apply inverse iteration to find the smallest eigenvalue δi = λi − ρ of A − ρI
• The ith eigenvalue λi of A is δi + ρ
• How to guess ρ?
• Does Gershgorin's theorem1 help? What is that?

1 https://en.wikipedia.org/wiki/Gershgorin_circle_theorem
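A sketch of the shift-and-invert idea, assuming a user-supplied shift ρ (shift_invert is an illustrative name; the eigenvalue of A closest to ρ is recovered as δ + ρ):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def shift_invert(A, rho, tol=1e-10, max_iter=1000):
    n = A.shape[0]
    lu, piv = lu_factor(A - rho * np.eye(n))   # factor the shifted matrix once
    q = np.random.default_rng(0).standard_normal(n)
    q /= np.linalg.norm(q, np.inf)
    for _ in range(max_iter):
        z = lu_solve((lu, piv), q)             # inverse iteration on A - rho*I
        q_new = z / np.linalg.norm(z, np.inf)
        if np.linalg.norm(q_new - q) <= tol:
            q = q_new
            break
        q = q_new
    delta = q @ (A - rho * np.eye(n)) @ q / (q @ q)   # delta = lambda_i - rho
    return rho + delta, q                             # lambda_i = delta + rho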
Rayleigh Quotient Iteration

Idea: Use the Rayleigh quotient as the shift for the next iteration.

1: Choose a random vector q1 and a tolerance tol
2: for i = 1, 2, . . . do
3: ρi ← (qi* A qi) / (qi* qi)
4: Solve (A − ρi I) q̂i+1 = qi
5: qi+1 ← q̂i+1 / σi+1
6: if ∥qi+1 − qi∥ ≤ tol then
7: break
8: end if
9: end for
• σi+1 is a suitable scaling factor, e.g. ∥q̂i+1∥∞
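A minimal NumPy sketch (rayleigh_quotient_iteration is an illustrative name; the shift changes every iteration, so each step re-solves a freshly shifted system, and the sign test handles eigenvector sign flips):

import numpy as np

def rayleigh_quotient_iteration(A, tol=1e-12, max_iter=50):
    q = np.random.default_rng(0).standard_normal(A.shape[0])
    q /= np.linalg.norm(q)                      # scaling by the 2-norm here
    rho = q @ A @ q                             # Rayleigh quotient (q has unit norm)
    for _ in range(max_iter):
        try:
            q_hat = np.linalg.solve(A - rho * np.eye(A.shape[0]), q)
        except np.linalg.LinAlgError:
            break                               # shift hit an eigenvalue exactly
        q_new = q_hat / np.linalg.norm(q_hat)
        # Eigenvectors are defined up to sign; compare against both orientations.
        converged = min(np.linalg.norm(q_new - q),
                        np.linalg.norm(q_new + q)) <= tol
        q = q_new
        rho = q @ A @ q
        if converged:
            break
    return rho, q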
