Applied Numerical Linear Algebra. Lecture 5
1 / 52
Improving the Accuracy of a Solution
2 / 52
Improving the Accuracy of a Solution
Iterative refinement repeats the following steps until x_{i+1} is accurate enough:

    r = A x_i − b
    solve A d = r for d
    x_{i+1} = x_i − d
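As an illustration only (a minimal NumPy/SciPy sketch, not the LAPACK implementation), the factorization and the solves for d below are done in single precision, while the residual r is accumulated in double precision, as the theorem below assumes:

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def iterative_refinement(A, b, steps=5):
        A32 = A.astype(np.float32)                 # working-precision copy of A
        lu, piv = lu_factor(A32)                   # one O(n^3) factorization
        x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
        for _ in range(steps):
            r = A @ x - b                          # residual in double precision
            d = lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)
            x = x - d                              # x_{i+1} = x_i - d
        return x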
3 / 52
THEOREM 2.7. Suppose that r is computed in double precision and
k(A) · ε < c ≡ 1/(3n³g + 1) < 1, where n is the dimension of A and
g is the pivot growth factor. Then repeated iterative refinement
converges with

    ||x_i − A^{-1}b||_∞ / ||A^{-1}b||_∞ = O(ε).
Note that the condition number does not appear in the final error
bound. This means that we compute the answer accurately
independent of the condition number, provided that k(A)ε is
sufficiently less than 1. (In practice, c is too conservative an upper
bound, and the algorithm often succeeds even when k(A)ε is
greater than c.)
4 / 52
Sketch of Proof.
Here we denote || · ||_∞ simply by || · ||. Our goal is to show that

    ||x_{i+1} − x|| ≤ (k(A)·ε/c) · ||x_i − x|| ≡ ζ · ||x_i − x||.
By assumption, ζ < 1, so this inequality implies that the error ||x_{i+1} − x||
decreases monotonically to zero. (In practice it will not decrease all the
way to zero because of rounding error in the assignment x_{i+1} = x_i − d,
which we are ignoring.)
We begin by estimating the error in the computed residual r. We get
r = fl(A x_i − b) = A x_i − b + f, where
|f| ≤ nε²(|A| · |x_i| + |b|) + ε|A x_i − b| ≈ ε|A x_i − b|. The ε² term comes
from the double precision computation of r, and the ε term comes from
rounding the double precision result back to single precision. Since
ε² ≪ ε, we will neglect the ε² term in the bound on |f|.
Next we get (A + δA)d = r, where from bound (2.11) we know that
||δA|| ≤ γ · ε · ||A||, where γ = 3n³g, although this is usually much too
large. As mentioned earlier, we simplify matters by assuming
x_{i+1} = x_i − d exactly.
5 / 52
Continuing to ignore all ε² terms, we get

    d = (A + δA)^{-1} r
      = (A + δA)^{-1}(A x_i − b + f)
      ≈ (A^{-1} − A^{-1} δA A^{-1})(A x_i − b + f)
      ≈ A^{-1}(A x_i − b) − A^{-1} δA A^{-1}(A x_i − b) + A^{-1} f
      = (x_i − x) − A^{-1} δA (x_i − x) + A^{-1} f.
6 / 52
Therefore x_{i+1} − x = x_i − d − x = A^{-1} δA (x_i − x) − A^{-1} f, and so

    ||x_{i+1} − x|| ≤ ||A^{-1}|| · (||δA|| + ε·||A||) · ||x_i − x||
                   ≤ ||A^{-1}|| · ||A|| · ε (γ + 1) · ||x_i − x||,

so if
ζ = ||A^{-1}|| · ||A|| · ε(γ + 1) = k(A)ε/c < 1,
then we have convergence.
7 / 52
Single Precision Iterative Refinement
8 / 52
For a proof, see
N. J. Higham. Accuracy and Stability of Numerical Algorithms.
SIAM, Philadelphia, PA, 1996.
M. Arioli, J. Demmel, and I. S. Duff. Solving sparse linear systems
with sparse backward error. SIAM J. Matrix Anal. Appl.,
10:165-190, 1989.
R. D. Skeel. Scaling for numerical stability in Gaussian elimination.
Journal of the ACM, 26:494-526, 1979.
R. D. Skeel. Iterative refinement implies numerical stability for
Gaussian elimination. Math. Comp., 35:817-832, 1980.
R. D. Skeel. Effect of equilibration on residual size for partial
pivoting. SIAM J. Numer. Anal., 18:449-454, 1981.
Single precision iterative refinement and the error bound (2.14) are
implemented in LAPACK routines like sgesvx.
9 / 52
Equilibration
11 / 52
These shared-memory parallel computers, for which LAPACK is designed,
include the Sun SPARCcenter 2000 [SPARCcenter 2000
architecture and implementation. Sun Microsystems, Inc., November
1993. Technical White Paper.];
SGI Power Challenge [SGI Power Challenge. Technical Report, Silicon
Graphics, 1995.];
DEC AlphaServer 8400 [D. M. Fenwick, D. J. Foley, W. B. Gist, S. R.
VanDoren, and D. Wissel. The AlphaServer 8000 series: High-end server
platform development. Digital Technical Journal, 7:43-65, 1995.];
and Cray C90/J90 [The Cray C90 series.
http://www.cray.com/PUBLIC/productinfo/C90/. Cray Research, Inc.;
The Cray J90 series. http://www.cray.com/PUBLIC/product-info/J90/.
Cray Research, Inc.].
12 / 52
ScaLAPACK is suitable for distributed-memory parallel computers, such
as the
IBM SP-2 [The IBM SP-2.
http://www.rs6000.ibm.com/software/sp products/sp2.html. IBM.],
Intel Paragon [The Intel Paragon,
http://www.ssd.intel.com/homepage.html. Intel.];
Cray T3 series [The Cray T3E series.
http://www.cray.com/PUBLIC/product-info/T3E/. Cray Research, Inc.];
networks of workstations [A. Anderson, D. Culler, D. Patterson, and the
NOW Team. A case for networks of workstations: NOW. IEEE Micro,
15(1):54-64, February 1995].
13 / 52
These libraries are available on NETLIB, including comprehensive
manuals [E. Anderson, et al., LAPACK Users’ Guide (2nd edition). SIAM,
Philadelphia, 1995; L. S. Blackford, J. Choi, A. Cleary, E. D’Azevedo, J.
Demmel, et al., ScaLAPACK Users’ Guide. Software, Environments, and
Tools 4. SIAM, Philadelphia, PA, 1997].
LAPACK was originally motivated by the poor performance of its
predecessors LINPACK and EISPACK (also available on NETLIB) on
some high-performance machines. For example, consider the table below,
which presents the speed in Mflops of LINPACK’s Cholesky routine spofa
on a Cray YMP, a supercomputer of the late 1980s. Cholesky is a variant
of Gaussian elimination suitable for symmetric positive definite matrices.
It is very similar to Algorithm 2.2. The table also includes the speed of
several other linear algebra operations. The Cray YMP is a parallel
computer with up to 8 processors that can be used simultaneously, so we
include one column of data for 1 processor and another column where all
8 processors are used.
14 / 52
Speed in Mflops on the Cray YMP:

                                        1 Proc.   8 Proc.
    Maximum speed                         330      2640
    Matrix-matrix multiply (n = 500)      312      2425
    Matrix-vector multiply (n = 500)      311      2285
    Solve TX = B (n = 500)                309      2398
    Solve Tx = b (n = 500)                272       584
    LINPACK (Cholesky, n = 500)            72        72
    LAPACK (Cholesky, n = 500)            290      1414
    LAPACK (Cholesky, n = 1000)           301      2115
The top line, the maximum speed of the machine, is an upper bound on
the numbers that follow. The basic linear algebra operations on the next
four lines have been measured using subroutines especially designed for
high speed on the Cray YMP. They all get reasonably close to the
maximum possible speed, except for solving Tx = b, a single triangular
system of linear equations, which does not use 8 processors effectively.
Solving TX = B refers to solving triangular systems with many
right-hand sides (B is a square matrix). These numbers are for large
matrices and vectors (n = 500).
15 / 52
Basic Linear Algebra Subroutines (BLAS)
16 / 52
In other words, a library of subroutines for matrix-matrix multiplication,
matrix-vector multiplication, and other similar operations is available with
a standard Fortran or C interface on high performance machines (and
many others), but underneath they have been optimized for each
machine. Our goal is to take advantage of these optimized BLAS by
reorganizing algorithms like Cholesky so that they call the BLAS to
perform most of their work.
Table 2.1 counts the number of memory references and floating point
operations performed by three related BLAS. For example, the number of
memory references needed to implement the saxpy operation in line 1 of
the table is 3n + 1, because we need to read n values of xi , n values of yi ,
and 1 value of α from slow memory to registers, and then write n values
of y_i back to slow memory. The last column gives the ratio q of flops to
memory references (keeping only its highest-order term in n).
17 / 52
The significance of q is that it tells us roughly how many flops we can
perform per memory reference, i.e., how much useful work we can do
compared to the time spent moving data, and hence how fast the algorithm
can potentially run. For example, suppose that an algorithm performs f
floating point operations, each of which takes t_arith seconds, and m
memory references, each of which takes t_mem seconds. Then the total
running time is

    f · t_arith + m · t_mem = f · t_arith · (1 + (m/f) · (t_mem/t_arith))
                            = f · t_arith · (1 + (1/q) · (t_mem/t_arith)),
assuming that the arithmetic and memory references are not performed in
parallel. Therefore, the larger the value of q, the closer the running time
is to the best possible running time f · tarith , which is how long the
algorithm would take if all data were in registers. This means that
algorithms with larger q values are better building blocks for other
algorithms.
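As a rough illustration of this model (the timings t_arith and t_mem below are made-up numbers chosen only for the example, not measurements), the following sketch evaluates f·t_arith·(1 + (1/q)·t_mem/t_arith) for the three q values of Table 2.1:

    # Sketch of the running-time model f*t_arith*(1 + (1/q)*(t_mem/t_arith)).
    # The two machine parameters below are illustrative assumptions, not measurements.
    t_arith = 1.0e-9        # seconds per flop (assumed)
    t_mem   = 20.0e-9       # seconds per memory reference (assumed)

    def model_time(f, q):
        """Predicted time for f flops with flop-to-memory-reference ratio q."""
        return f * t_arith * (1.0 + (1.0 / q) * (t_mem / t_arith))

    n = 1000
    for name, f, q in [("saxpy (BLAS1)", 2 * n, 2 / 3),
                       ("matrix-vector (BLAS2)", 2 * n**2, 2),
                       ("matrix-matrix (BLAS3)", 2 * n**3, n / 2)]:
        print(f"{name:24s} predicted {model_time(f, q):.2e} s "
              f"vs. ideal {f * t_arith:.2e} s")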
18 / 52
Table 2.1 reflects a hierarchy of operations: Operations such as saxpy
perform O(n) flops on vectors and offer the worst q values; these are
called Level 1 BLAS, or BLAS1 [C. Lawson, R. Hanson, D. Kincaid, and
F. Krogh. Basic Linear Algebra Subprograms for Fortran usage. ACM
Trans. Math. Software, 5:308-323, 1979], and include inner products,
multiplying a vector by a scalar, and other simple operations.
Operations such as matrix-vector multiplication perform O(n²) flops on
matrices and vectors and offer slightly better q values; these are called
Level 2 BLAS, or BLAS2 [J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson. Algorithm
656: An extended set of FORTRAN Basic Linear Algebra Subprograms. ACM Trans. Math. Software, 14:18-32,
1988; J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson. An extended set of FORTRAN Basic Linear
Algebra Subprograms. ACM Trans. Math. Software, 14:1-17, 1988], and include solving triangular systems of
equations and rank-1 updates of matrices (A + xy^T, with x and y column vectors). Operations such as matrix-matrix
multiplication perform O(n³) flops on pairs of matrices and offer the best q values; these are called Level 3 BLAS,
or BLAS3 [J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling. Algorithm 679: A set of Level 3 Basic Linear
Algebra Subprograms. ACM Trans. Math. Software, 16:18-28, 1990; J. Dongarra, J. Du Croz, I. Duff, and S.
Hammarling. A set of Level 3 Basic Linear Algebra Subprograms. ACM Trans. Math. Software, 16:1-17, 1990],
and include solving triangular systems of equations with many right-hand sides.
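For concreteness, the three levels correspond to the following NumPy operations (an illustration only, not part of the reference BLAS interface; NumPy dispatches the Level 2 and Level 3 cases to an optimized BLAS such as OpenBLAS or MKL when one is available):

    import numpy as np

    n = 500
    alpha = 2.0
    x, y = np.random.rand(n), np.random.rand(n)
    A, B, C = (np.random.rand(n, n) for _ in range(3))

    y = alpha * x + y      # BLAS1 (saxpy-like):  O(n)   flops on O(n)   data
    y = A @ x + y          # BLAS2 (gemv-like):   O(n^2) flops on O(n^2) data
    C = A @ B + C          # BLAS3 (gemm-like):   O(n^3) flops on O(n^2) data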
19 / 52
Table 2.1. Counting floating point operations and memory references for
the BLAS. f is the number of floating point operations, and m is the
number of memory references.
    Operation            Definition                                           f      m         q = f/m
    saxpy (BLAS1)        y = α·x + y, i.e. y_i = α·x_i + y_i,                 2n     3n + 1    2/3
                         i = 1, ..., n
    Matrix-vector mult   y = A·x + y, i.e. y_i = Σ_{j=1}^{n} a_ij·x_j + y_i,  2n²    n² + 3n   2
    (BLAS2)              i = 1, ..., n
    Matrix-matrix mult   C = A·B + C, i.e. c_ij = Σ_{k=1}^{n} a_ik·b_kj + c_ij,  2n³    4n²    n/2
    (BLAS3)              i, j = 1, ..., n
20 / 52
How to Optimize Matrix Multiplication
21 / 52
The simplest matrix-multiplication algorithm that one might try consists
of three nested loops, which we have annotated to indicate the data
movements.
ALGORITHM 2.6. Unblocked matrix multiplication (annotated to
indicate memory activity):
for i = 1 to n
{ Read row i of A into fast memory}
for j = 1 to n
{ Read Cij into fast memory}
{ Read column j of B into fast memory}
for k = 1 to n
Cij = Cij + Aik · Bkj
end for
{ Write Cij back to slow memory}
end for
end for
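A direct (and deliberately slow) Python rendering of Algorithm 2.6, kept here only for comparison with the blocked sketch that follows Algorithm 2.7 below; in practice one would call an optimized BLAS instead:

    import numpy as np

    def matmul_unblocked(A, B, C):
        """C = C + A*B with three nested loops (Algorithm 2.6)."""
        n = A.shape[0]
        for i in range(n):
            for j in range(n):
                s = C[i, j]
                for k in range(n):
                    s += A[i, k] * B[k, j]
                C[i, j] = s
        return C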
22 / 52
Here is the detailed count of memory references: n³ for reading B n
times (once for each value of i); n² for reading A one row at a time and
keeping it in fast memory until it is no longer needed; and 2n² for reading
one entry of C at a time, keeping it in fast memory until it is completely
computed, and then moving it back to slow memory. This comes to
n³ + 3n² memory moves, so q = 2n³/(n³ + 3n²) ≈ 2, which is no better
than the Level 2 BLAS and far from the maximum possible n/2 (see
Table 2.1). If M ≪ n, so that we cannot keep a full row of A in fast
memory, q further decreases to 1, since the algorithm reduces to a
sequence of inner products, which are Level 1 BLAS. Every
permutation of the three loops on i, j, and k yields another algorithm
with about the same q.
Our preferred algorithm uses blocking, where C is broken into an N × N
block matrix with n/N × n/N blocks C^{ij}, and A and B are partitioned
the same way (for example, with N = 4 each matrix becomes a 4 × 4 array of blocks).
23 / 52
ALGORITHM 2.7. Blocked matrix multiplication (annotated to
indicate memory activity):
for i = 1 to N
for j = 1 to N
{ Read C ij into fast memory}
for k = 1 to N
{ Read Aik into fast memory}
{ Read B kj into fast memory}
C ij = C ij + Aik · B kj
end for
{ Write C ij back to slow memory}
end for
end for
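A sketch of Algorithm 2.7 in NumPy, with the block size chosen so that three blocks fit in a fast memory of M words (M is an assumed parameter here, not something the library reports); the innermost update is left to NumPy so that it maps onto an optimized BLAS3 call:

    import numpy as np

    def matmul_blocked(A, B, C, M=3 * 64**2):
        """C = C + A*B by blocks, assuming 3 blocks of size (n/N)^2 fit in M words."""
        n = A.shape[0]
        blk = max(1, int(np.sqrt(M / 3)))          # block dimension n/N
        for i in range(0, n, blk):
            for j in range(0, n, blk):
                Cij = C[i:i+blk, j:j+blk]          # a view: updates land in C
                for k in range(0, n, blk):
                    Cij += A[i:i+blk, k:k+blk] @ B[k:k+blk, j:j+blk]
        return C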
24 / 52
The number of memory references
Our memory reference count is as follows: 2n² for reading and writing
each block of C once; Nn² for reading A N times (each n/N-by-n/N
submatrix A^{ik} is read once for every value of j, i.e. N times); and
Nn² for reading B N times (each submatrix B^{kj} is read once for every
value of i). This gives a total of (2N + 2)n² ≈ 2Nn² memory references.
So we want to choose N as small as possible to minimize the number of
memory references. But N is subject to the constraint M ≥ 3(n/N)²,
which says that one block each from A, B, and C must fit in fast
memory simultaneously.
This yields N ≈ n·√(3/M), and so q ≈ (2n³)/(2Nn²) = n/N ≈ √(M/3), which is much better
than the previous algorithm.
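A small worked example of this estimate (the fast-memory size M below is an assumption chosen only for illustration):

    import math

    n = 4096                              # matrix dimension
    M = 256 * 1024                        # assumed fast-memory capacity in words
    N = math.ceil(n * math.sqrt(3 / M))   # number of blocks per dimension
    q = math.sqrt(M / 3)                  # flops per memory reference, roughly n/N
    print(N, round(q))                    # N = 14, q ≈ 296, vs. q ≈ 2 unblocked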
25 / 52
The number of memory references
26 / 52
The number of memory references
27 / 52
Both of the above matrix-matrix multiplication algorithms
perform 2n³ arithmetic operations.
It turns out that there are other implementations of
matrix-matrix multiplication that use far fewer operations.
One of them is Strassen’s method [A. Aho, J. Hopcroft, and J. Ullman. The
Design and Analysis of Computer Algorithms.
Addison-Wesley, Reading, MA, 1974].
ALGORITHM 2.8. Strassen’s matrix multiplication algorithm:
C = Strassen(A, B, n)
/* Return C = A ∗ B, where A and B are n-by-n;
Assume n is a power of 2 */
if n = 1
return C = A ∗ B /* scalar multiplication */
else
Partition A = [ A11, A12 ; A21, A22 ] and B = [ B11, B12 ; B21, B22 ],
where the subblocks Aij and Bij are n/2-by-n/2
P1 = Strassen( A12 − A22 , B21 + B22 , n/2 )
P2 = Strassen( A11 + A22 , B11 + B22 , n/2 )
P3 = Strassen( A11 − A21 , B11 + B12 , n/2 )
P4 = Strassen( A11 + A12 , B22 , n/2 )
P5 = Strassen( A11 , B12 − B22 , n/2 )
P6 = Strassen( A22 , B21 − B11 , n/2 )
P7 = Strassen( A21 + A22 , B11 , n/2 )
C11 = P1 + P2 − P4 + P6
C12 = P4 + P5
C21 = P6 + P7
C22 = P2 − P3 + P5 − P7
return C = [ C11, C12 ; C21, C22 ]
end if
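A compact NumPy sketch of Algorithm 2.8 (recursing down to a small crossover size rather than to 1-by-1 blocks, which any practical implementation would do; the crossover value is an arbitrary illustrative choice):

    import numpy as np

    def strassen(A, B, crossover=64):
        """C = A*B by Algorithm 2.8; assumes n is a power of 2."""
        n = A.shape[0]
        if n <= crossover:
            return A @ B                       # ordinary multiplication below the crossover
        m = n // 2
        A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
        B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
        P1 = strassen(A12 - A22, B21 + B22, crossover)
        P2 = strassen(A11 + A22, B11 + B22, crossover)
        P3 = strassen(A11 - A21, B11 + B12, crossover)
        P4 = strassen(A11 + A12, B22, crossover)
        P5 = strassen(A11, B12 - B22, crossover)
        P6 = strassen(A22, B21 - B11, crossover)
        P7 = strassen(A21 + A22, B11, crossover)
        C = np.empty_like(A)
        C[:m, :m] = P1 + P2 - P4 + P6          # C11
        C[:m, m:] = P4 + P5                    # C12
        C[m:, :m] = P6 + P7                    # C21
        C[m:, m:] = P2 - P3 + P5 - P7          # C22
        return C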
29 / 52
Complexity of Strassen’s algorithm
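A sketch of the standard operation count: each call performs 7 recursive multiplications of size n/2 plus O(n²) work for the block additions, so the count T(n) satisfies

    T(n) = 7·T(n/2) + O(n²)   ⟹   T(n) = O(n^{log₂ 7}) ≈ O(n^{2.81}),

compared with 2n³ for the conventional algorithms above.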
30 / 52
Special Linear Systems
31 / 52
2.7.1. Real Symmetric Positive Definite Matrices
32 / 52
Proof.
1. If X is nonsingular, then A is s.p.d. if and only if X T AX is s.p.d.
X nonsingular implies Xx ≠ 0 for all x ≠ 0, so x^T(X^T A X)x = (Xx)^T A (Xx) > 0 for
all x ≠ 0. So A s.p.d. implies X^T A X is s.p.d. Use X^{-1} in place of X to deduce
the other implication.
33 / 52
2. If A is s.p.d. and H is any principal submatrix of
A (H = A(j : k, j : k) for some 1 ≤ j ≤ k ≤ n), then H is s.p.d.
Suppose first that H = A(1 : m, 1 : m). Then given any m-vector
y, the n-vector x = [y^T, 0]^T satisfies y^T H y = x^T A x. So if
x^T A x > 0 for all nonzero x, then y^T H y > 0 for all nonzero y, and
so H is s.p.d. If H does not lie in the upper left corner of A, let P
be a permutation so that H does lie in the upper left corner of
P T AP and apply Part 1.
34 / 52
3. A is s.p.d. if and only if A = AT and all its eigenvalues are
positive.
Let X be the real orthogonal eigenvector matrix of A, so that
X^T A X = Λ is the diagonal matrix of the real eigenvalues λ_i. Since
x^T Λ x = Σ_i λ_i x_i², Λ is s.p.d. if and only if each λ_i > 0. Now
apply Part 1.
35 / 52
4. If A is s.p.d., then all aii > 0, and maxij |aij | = maxi aii > 0.
Let e_i be the ith column of the identity matrix. Then
e_i^T A e_i = a_ii > 0 for all i. If |a_kl| = max_ij |a_ij| but k ≠ l, choose
x = e_k − sign(a_kl) e_l. Then x^T A x = a_kk + a_ll − 2|a_kl| ≤ 0,
contradicting positive-definiteness.
36 / 52
5. A is s.p.d. if and only if there is a unique lower triangular
nonsingular matrix L, with positive diagonal entries, such that
A = LLT . A = LLT is called the Cholesky factorization of A, and L
is called the Cholesky factor of A.
Suppose A = LL^T with L nonsingular. Then
x^T A x = (x^T L)(L^T x) = ||L^T x||_2² > 0 for all x ≠ 0, so A is s.p.d. If
A is s.p.d., we show that L exists by induction on the dimension n.
If we choose each l_ii > 0, our construction will determine L
uniquely. If n = 1, choose l_11 = √a_11, which exists since a_11 > 0.
As with Gaussian elimination, it suffices to understand the block
2-by-2 case.
37 / 52
Write

    A = [ a11, A12 ; A12^T, A22 ]
      = [ √a11, 0 ; A12^T/√a11, I ] · [ 1, 0 ; 0, Ã22 ] · [ √a11, A12/√a11 ; 0, I ]
      = [ a11, A12 ; A12^T, Ã22 + A12^T·A12/a11 ],

so Ã22 = A22 − A12^T·A12/a11, and this (n − 1)-by-(n − 1) matrix is symmetric.
38 / 52
By Part 1 above, [ 1, 0 ; 0, Ã22 ] is s.p.d., so by Part 2 Ã22 is s.p.d.
Thus by induction there exists an L̃ such that Ã22 = L̃·L̃^T, and

    A = [ √a11, 0 ; A12^T/√a11, I ] · [ 1, 0 ; 0, L̃·L̃^T ] · [ √a11, A12/√a11 ; 0, I ]
      = [ √a11, 0 ; A12^T/√a11, L̃ ] · [ √a11, A12/√a11 ; 0, L̃^T ]  ≡  L·L^T.
39 / 52
We may rewrite this induction as the following algorithm.
ALGORITHM 2.11. Cholesky algorithm:
for j = 1 to n
l_jj = (a_jj − Σ_{k=1}^{j−1} l_jk²)^{1/2}
for i = j + 1 to n
l_ij = (a_ij − Σ_{k=1}^{j−1} l_ik·l_jk) / l_jj
end for
end for
If A is not positive definite, then (in exact arithmetic) this
algorithm will fail by attempting to compute the square root of a
negative number or by dividing by zero; this is the cheapest way to
test if a symmetric matrix is positive definite.
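A direct NumPy transcription of Algorithm 2.11 (an unoptimized sketch; in practice one would call the LAPACK routines spotrf/sposv):

    import numpy as np

    def cholesky(A):
        """Return lower triangular L with A = L L^T (Algorithm 2.11).
        Raises ValueError if A is not symmetric positive definite."""
        n = A.shape[0]
        L = np.zeros_like(A, dtype=float)
        for j in range(n):
            d = A[j, j] - np.dot(L[j, :j], L[j, :j])
            if d <= 0:
                raise ValueError("matrix is not positive definite")
            L[j, j] = np.sqrt(d)
            for i in range(j + 1, n):
                L[i, j] = (A[i, j] - np.dot(L[i, :j], L[j, :j])) / L[j, j]
        return L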
40 / 52
The number of flops in Cholesky algorithm
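A sketch of the standard count: computing l_jj costs about 2j flops, and each of the n − j entries l_ij below it costs about 2j flops as well, so the total is approximately

    Σ_{j=1}^{n} 2j·(n − j) ≈ n³/3

flops plus n square roots, about half the cost of Gaussian elimination (2n³/3).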
41 / 52
Pivoting is not necessary for Cholesky to be numerically stable
(equivalently, we could also say any diagonal pivot order is
numerically stable). We show this as follows. The same analysis as
for Gaussian elimination in section 2.4.2 shows that the computed
solution x̂ satisfies (A + δA)x̂ = b with |δA| ≤ 3nε·|L| · |L^T|. But
by the Cauchy–Schwarz inequality and Part 4 of Proposition 2.2,

    (|L| · |L^T|)_ij = Σ_k |l_ik| · |l_jk|
                     ≤ (Σ_k l_ik²)^{1/2} · (Σ_k l_jk²)^{1/2}
                     = √a_ii · √a_jj
                     ≤ max_i a_ii = max_ij |a_ij|,

so every entry of |L| · |L^T| is bounded by the largest entry of |A|, and hence
||δA|| is small compared with ||A||.
42 / 52
Symmetric Indefinite Matrices
The question of whether we can still save half the time and half the
space when solving a symmetric but indefinite (neither positive definite
nor negative definite) linear system naturally arises. It turns out to be
possible, but a more complicated pivoting scheme and factorization is
required. If A is nonsingular, one can show that there exists a
permutation P, a unit lower triangular matrix L, and a block diagonal
matrix D with 1-by-1 and 2-by-2 blocks such that PAP^T = LDL^T.
To see why 2-by-2 blocks are needed in D, consider the matrix [ 0, 1 ; 1, 0 ].
This factorization can be computed stably, saving about half the work
and space compared to standard Gaussian elimination. The name of the
LAPACK subroutine which does this operation is ssysv. The algorithm is
described in [J. Bunch and L. Kaufman. Some stable methods for
calculating inertia and solving symmetric linear systems. Math. Comp.,
31:163-179, 1977].
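SciPy exposes a symmetric indefinite factorization of this kind as scipy.linalg.ldl; the example below is only an illustration (the exact return convention should be checked against the SciPy documentation) and shows the 2-by-2 pivot block forced by the matrix [ 0, 1 ; 1, 0 ]:

    import numpy as np
    from scipy.linalg import ldl

    A = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    L, D, perm = ldl(A)                    # symmetric indefinite LDL^T factorization
    print(D)                               # one 2-by-2 block: no 1-by-1 pivot works here
    print(np.allclose(L @ D @ L.T, A))     # True: permutation is folded into L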
43 / 52
Band Matrices
Band matrices arise often in practice and are useful to recognize because
their L and U factors are also ”essentially banded”, making them cheaper
to compute and store. We consider LU factorization without pivoting
and show that L and U are banded in the usual sense, with the same
band widths as A.
44 / 52
PROPOSITION 2.3. Let A be banded with lower bandwidth bL and
upper bandwidth bU . Let A = LU be computed without pivoting. Then
L has lower bandwidth bL and U has upper bandwidth bU . L and U can
be computed in about 2n · bU · bL arithmetic operations when bU and bL
are small compared to n. The space needed is about n·(bL + bU + 1). The full
cost of solving Ax = b is 2n·bU·bL + 2n·bU + 2n·bL.
PROPOSITION 2.4. Let A be banded with lower bandwidth bL and
upper bandwidth bU . Then after Gaussian elimination with partial
pivoting, U is banded with upper bandwidth at most bL + bU , and L is
”essentially banded” with lower bandwidth bL . This means that L has at
most bL + 1 nonzeros in each column and so can be stored in the same
space as a band matrix with lower bandwidth bL .
Gaussian elimination and Cholesky for band matrices are available in
LAPACK routines like sgbsv and spbsv.
Band matrices often arise from discretizing physical problems with
nearest neighbor interactions on a mesh (provided the unknowns are
ordered rowwise or columnwise).
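For illustration, SciPy's scipy.linalg.solve_banded solves a general band system given the matrix in diagonal-ordered storage; this is a sketch for a tridiagonal example (bL = bU = 1), and the storage convention should be checked against the SciPy documentation:

    import numpy as np
    from scipy.linalg import solve_banded

    n = 6
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)

    # Diagonal-ordered form for (bL, bU) = (1, 1): row 0 = superdiagonal (padded in front),
    # row 1 = main diagonal, row 2 = subdiagonal (padded at the end).
    ab = np.zeros((3, n))
    ab[0, 1:] = off
    ab[1, :] = main
    ab[2, :-1] = off

    b = np.ones(n)
    x = solve_banded((1, 1), ab, b)        # O(n * bL * bU) work instead of O(n^3)

    A_dense = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    print(np.allclose(A_dense @ x, b))     # True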
45 / 52
Example: ODE
Consider the two-point boundary value problem
y''(x) − p(x)·y'(x) − q(x)·y(x) = r(x) on 0 < x < 1, with boundary
conditions y(0) = α and y(1) = β, where we assume q(x) ≥ q > 0 for some
constant q. We discretize on the uniform mesh x_i = i·h, h = 1/(N + 1),
i = 0, 1, ..., N + 1.
46 / 52
We need to derive equations to solve for our desired
approximations y_i ≈ y(x_i), where y_0 = α and y_{N+1} = β. To derive
these equations, we approximate the derivatives y'(x_i) and y''(x_i) by the
following centered finite difference approximations:

    y'(x_i) ≈ (y_{i+1} − y_{i−1}) / (2h),
    y''(x_i) ≈ (y_{i+1} − 2y_i + y_{i−1}) / h².

Inserting these approximations into the differential equation yields

    (y_{i+1} − 2y_i + y_{i−1}) / h² − p_i·(y_{i+1} − y_{i−1}) / (2h) − q_i·y_i = r_i,   1 ≤ i ≤ N,

where p_i = p(x_i), q_i = q(x_i), and r_i = r(x_i).
47 / 52
Rewriting this as a linear system we get Ay = b, where

    y = (y_1, y_2, ..., y_N)^T

and

    b = −(h²/2)·(r_1, r_2, ..., r_N)^T + ( (1/2 + (h/4)·p_1)·α, 0, ..., 0, (1/2 − (h/4)·p_N)·β )^T,
48 / 52
and A is the N-by-N tridiagonal matrix with diagonal entries a_i,
subdiagonal entries −b_i (i = 2, ..., N), and superdiagonal entries −c_i
(i = 1, ..., N − 1), where

    a_i = 1 + (h²/2)·q_i,
    b_i = (1/2)·(1 + (h/2)·p_i),
    c_i = (1/2)·(1 − (h/2)·p_i).
Note that ai > 0, and also bi > 0 and ci > 0 if h is small enough.
This is a nonsymmetric tridiagonal system to solve for y . We will show
how to change it to a symmetric positive definite tridiagonal system, so
that we may use band Cholesky to solve it.
49 / 52
Choose D = diag( 1, √(c_1/b_2), √(c_1·c_2/(b_2·b_3)), ..., √((c_1·c_2 ··· c_{N−1})/(b_2·b_3 ··· b_N)) ).
Then we may change Ay = b to (DAD^{−1})(Dy) = Db, or Ãỹ = b̃, where
Ã = DAD^{−1} is the symmetric tridiagonal matrix with diagonal entries
a_1, ..., a_N and off-diagonal entries −√(c_i·b_{i+1}), i = 1, ..., N − 1.
50 / 52
Gershgorin’s Theorem
Every eigenvalue λ of a matrix B lies in one of the disks centered at the
diagonal entries b_kk: if Bx = λx with x scaled so that |x_k| = ||x||_∞ = 1, then

    |λ − b_kk| = | Σ_{j≠k} b_kj·x_j | ≤ Σ_{j≠k} |b_kj·x_j| ≤ Σ_{j≠k} |b_kj|.
51 / 52
Example: ODE (continuation)
Applying Gershgorin’s theorem to row i of the tridiagonal matrix A gives
|λ − a_i| ≤ |b_i| + |c_i|, and

    |b_i| + |c_i| = (1/2)·(1 + (h/2)·p_i) + (1/2)·(1 − (h/2)·p_i) = 1 < 1 + (h²/2)·q ≤ 1 + (h²/2)·q_i = a_i

(for h small enough that 1 ± (h/2)·p_i > 0). Hence every eigenvalue of A, and
therefore of the similar matrix Ã, satisfies λ ≥ a_i − 1 ≥ (h²/2)·q > 0, so Ã is
symmetric positive definite and band Cholesky can be applied.
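Putting the whole example together, here is a sketch that discretizes a sample problem (the choices p(x) = 1, q(x) = 1, r(x) = −1, α = β = 0 are arbitrary illustrations), symmetrizes the tridiagonal system with D, and solves it with banded Cholesky via scipy.linalg.solveh_banded (the banded storage convention is as I understand it from the SciPy documentation):

    import numpy as np
    from scipy.linalg import solveh_banded

    # Sample data for y'' - p y' - q y = r on (0,1), y(0) = alpha, y(1) = beta.
    N, alpha, beta = 99, 0.0, 0.0
    h = 1.0 / (N + 1)
    x = (np.arange(N) + 1) * h
    p, q, r = np.ones(N), np.ones(N), -np.ones(N)    # illustrative coefficient choices

    a = 1.0 + 0.5 * h**2 * q                         # diagonal entries a_i
    bcoef = 0.5 * (1.0 + 0.5 * h * p)                # subdiagonal coefficients b_i
    c = 0.5 * (1.0 - 0.5 * h * p)                    # superdiagonal coefficients c_i

    rhs = -0.5 * h**2 * r
    rhs[0] += bcoef[0] * alpha                       # boundary term (1/2 + h p_1/4) alpha
    rhs[-1] += c[-1] * beta                          # boundary term (1/2 - h p_N/4) beta

    # Symmetrize: D A D^{-1} is tridiagonal with off-diagonal -sqrt(c_i b_{i+1}).
    d = np.cumprod(np.concatenate(([1.0], np.sqrt(c[:-1] / bcoef[1:]))))
    offdiag = -np.sqrt(c[:-1] * bcoef[1:])

    ab = np.zeros((2, N))                            # upper diagonal-ordered band storage
    ab[0, 1:] = offdiag
    ab[1, :] = a

    y_tilde = solveh_banded(ab, d * rhs)             # solve (D A D^{-1}) (D y) = D b
    y = y_tilde / d                                  # recover y = D^{-1} (D y)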
52 / 52