Matrix Algorithms
Timothy Vismor
March 1, 2012
Abstract
This document examines various aspects of matrix and linear algebra
that are relevant to the analysis of large scale networks. Particular emphasis
is placed on computational aspects of the topics of interest.
Contents
1 Matrix Nomenclature
2 Matrix Algebra
   2.1 Matrix Equality
   2.2 Matrix Transposition
   2.3 Scalar Multiplication
   2.4 Matrix Addition
   2.5 Matrix Multiplication
   2.6 Inverse of a Matrix
   2.7 Rank of a Matrix
   2.8 Similarity Transformations
   2.9 Partitioning a Matrix
3 Linear Systems
   3.1 Solving Fully Determined Systems
   3.2 Solving Underdetermined Systems
   3.3 Solving Overdetermined Systems
   3.4 Computational Complexity of Linear Systems
4 LU Decomposition
   4.1 Gaussian Elimination
   4.2 Doolittle's LU Factorization
   4.3 Crout's LU Factorization
   4.4 LDU Factorization
   4.5 Numerical Instability During Factorization
   4.6 Pivoting Strategies for Numerical Stability
   4.7 Diagonal Dominance and Pivoting
   4.8 Partial Pivoting
   4.9 Complete Pivoting
   4.10 Computational Complexity of Pivoting
   4.11 Scaling Strategies
5 Solving Triangular Systems
6 Factor Update
   6.1 LDU Factor Update
   6.2 LU Factor Update
   6.3 Additional Considerations
7 Symmetric Matrices
   7.1 LDU Decomposition of Symmetric Matrices
   7.2 LU Decomposition of Symmetric Matrices
   7.3 Symmetric Matrix Data Structures
   7.4 Doolittle's Method for Symmetric Matrices
   7.5 Crout's Method for Symmetric Matrices
   7.6 Forward Substitution for Symmetric Systems
      7.6.1 Forward Substitution Using Lower Triangular Factors
      7.6.2 Forward Substitution Using Upper Triangular Factors
   7.7 Backward Substitution for Symmetric Systems
      7.7.1 Back Substitution Using Upper Triangular Factors
      7.7.2 Back Substitution Using Lower Triangular Factors
   7.8 Symmetric Factor Update
      7.8.1 Symmetric LDU Factor Update
      7.8.2 Symmetric LU Factor Update
8 Sparse Matrices
   8.1 Sparse Matrix Methodology
   8.2 Abstract Data Types for Sparse Matrices
      8.2.1 Sparse Matrix Proper
      8.2.2 Adjacency List
      8.2.3 Reduced Graph
      8.2.4 List
      8.2.5 Mapping
      8.2.6 Vector
   8.3 Pivoting To Preserve Sparsity
      8.3.1 Markowitz Pivot Strategy
      8.3.2 Minimum Degree Pivot Strategy
   8.4 Symbolic Factorization of Sparse Matrices
      8.4.1 Symbolic Factorization with Minimum Degree Pivot
      8.4.2 Computational Complexity of Symbolic Factorization
   8.5 Creating 𝐏𝐀𝐏𝐓 from a Symbolic Factorization
   8.6 Numeric Factorization of Sparse Matrices
   8.7 Solving Sparse Linear Systems
      8.7.1 Permute the Constant Vector
   8.8 Sparse LU Factor Update
9 Implementation Notes
   9.1 Sparse Matrix Representation
   9.2 Database Cache Performance
      9.2.1 Sequential Matrix Element Retrieval
      9.2.2 Arbitrary Matrix Element Retrieval
      9.2.3 Arbitrary Matrix Element Update
      9.2.4 Matrix Element Insertion
      9.2.5 Matrix Element Deletion
      9.2.6 Empirical Performance Measurements
   9.3 Floating Point Performance
   9.4 Auxiliary Store
List of Figures
1 Computational Sequence of Doolittle's Method
2 Computational Sequence of Crout's Method
3 Computational Sequence of Tinney's LDU Decomposition
4 Matrix Tuple Structure
5 Sparse Matrix Representation
List of Tables
1 Database Cache Benchmarks
2 Floating Point Benchmarks
3 Math Library Benchmarks
List of Algorithms
1 LU Decomposition
2 Doolittle's LU Decomposition
3 Crout's LU Decomposition
4 Forward Substitution
5 Backward Substitution
6 Forward Substitution - Outer Product
7 Back Substitution - Outer Product
8 LDU Factor Update
9 LU Factor Update
10 Doolittle's Method - Symmetric Implementation
11 Doolittle's Method - Symmetric, Array Based
12 Crout's Method - Symmetric Implementation
13 Crout's Method - Symmetric, Array Based
14 Symmetric Forward Substitution via Upper Triangular Factors
15 Symmetric Forward Substitution using 𝐔 with Array Storage
16 Symmetric Forward Substitution using 𝐔, Outer Product
17 Symmetric Forward Substitution using 𝐔, Outer Product, Array
18 Symmetric Back Substitution using Lower Triangular Factors
19 Symmetric Backward Substitution using 𝐋 with Array Storage
20 Symmetric LDU Factor Update
21 Symmetric LU Factor Update
22 Symbolic Factorization of a Sparse Matrix
23 Construct 𝐏𝐀𝐏𝐓 of a Sparse Matrix
24 Construct 𝐏𝐀𝐏𝐓 of a Sparse Symmetric Matrix
25 LU Decomposition of a Sparse Matrix
26 LU Decomposition of a Sparse Symmetric Matrix
27 Permute 𝐛 to order 𝐏
28 Sparse Forward Substitution
29 Sparse Forward Substitution - Outer Product
30 Sparse Back Substitution
31 Permute 𝐱 to order 𝐐
32 Factorization Path
33 Symmetric Factorization Path
34 Structurally Symmetric Sparse LU Factor Update
35 Symmetric Sparse LU Factor Update
1 Matrix Nomenclature
Since any finite dimensional linear operator can be represented as a matrix, matrix
algebra and linear algebra are two sides of the same coin. Properties of linear
systems are gleaned from either discipline. The following sections draw on both
of these perspectives to examine the basic concepts, numerical techniques, and
practical constraints of computational linear algebra.
Assuming the symbols 𝑥𝑖 represent variables and the symbols 𝑎𝑖𝑗 and 𝑏𝑖 are
complex constants, the following is a system of 𝑚 linear equations in 𝑛 unknowns.

𝑎11𝑥1 + 𝑎12𝑥2 + ⋯ + 𝑎1𝑛𝑥𝑛 = 𝑏1
𝑎21𝑥1 + 𝑎22𝑥2 + ⋯ + 𝑎2𝑛𝑥𝑛 = 𝑏2
⋮
𝑎𝑚1𝑥1 + 𝑎𝑚2𝑥2 + ⋯ + 𝑎𝑚𝑛𝑥𝑛 = 𝑏𝑚 (1)

This system is expressed more compactly in matrix notation as

𝐀𝐱 = 𝐛 (2)

where 𝐀 is the 𝑚 × 𝑛 matrix of coefficients 𝑎𝑖𝑗, 𝐱 is the 𝑛 vector of unknowns 𝑥𝑖,
and 𝐛 is the 𝑚 vector of constants 𝑏𝑖.
A matrix whose subdiagonal entries are zero is called upper triangular. An
upper triangular matrix with ones along the diagonal is called unit upper
triangular. The following 3 × 3 matrix is unit upper triangular.

⎛ 1   𝑎12  𝑎13 ⎞
⎜ 0   1    𝑎23 ⎟
⎝ 0   0    1   ⎠
Similarly, a matrix whose superdiagonal entries are zero is called lower
triangular. A lower triangular matrix with ones along the diagonal is called unit
lower triangular. The following 3 × 3 matrix is lower triangular.
⎛ 𝑎11  0    0   ⎞
⎜ 𝑎21  𝑎22  0   ⎟
⎝ 𝑎31  𝑎32  𝑎33 ⎠
A matrix whose superdiagonal and subdiagonal entries are zero is a diagonal
matrix, e.g.
⎛ 𝑎11  0    0   ⎞
⎜ 0    𝑎22  0   ⎟
⎝ 0    0    𝑎33 ⎠
A square matrix whose subdiagonal elements are the mirror image of its
superdiagonal elements is referred to as a symmetric matrix. More formally, a
symmetric matrix 𝐀 has the property 𝑎𝑖𝑗 = 𝑎𝑗𝑖 . A trivial example of a symmetric
matrix is a diagonal matrix. The general case of a 3 × 3 symmetric matrix follows.

⎛ 𝑎11  𝑎12  𝑎13 ⎞
⎜ 𝑎12  𝑎22  𝑎23 ⎟
⎝ 𝑎13  𝑎23  𝑎33 ⎠
2 Matrix Algebra
The set of square matrices of dimension 𝑛 forms an algebraic entity known as a
ring. By definition, a ring consists of a set 𝑅 and two operators (addition + and
multiplication ×) such that addition is associative and commutative and possesses
both an identity element and inverse elements, multiplication is associative and
possesses an identity element, and multiplication distributes over addition.
7
2.1 Matrix Equality 2 MATRIX ALGEBRA
2.1 Matrix Equality

Two 𝑚 × 𝑛 matrices 𝐀 and 𝐁 are equal when each element of 𝐀 is identical to the
corresponding element of 𝐁. More formally,

𝐀 = 𝐁 (4)

implies

𝑎𝑖𝑗 = 𝑏𝑖𝑗 , where 1 ≤ 𝑖 ≤ 𝑚 and 1 ≤ 𝑗 ≤ 𝑛 (5)

2.2 Matrix Transposition

The transpose of an 𝑚 × 𝑛 matrix 𝐀, denoted 𝐀𝐓, is the 𝑛 × 𝑚 matrix obtained by
interchanging the rows and columns of 𝐀, i.e.

(𝐀𝐓)𝑖𝑗 = 𝑎𝑗𝑖 (6)

A symmetric matrix is its own transpose, i.e.

𝐀 = 𝐀𝐓
The transpose of the 2 × 3 matrix

⎛ 𝑎11  𝑎12  𝑎13 ⎞
⎝ 𝑎21  𝑎22  𝑎23 ⎠

is the 3 × 2 matrix
⎛ 𝑎11  𝑎21 ⎞
⎜ 𝑎12  𝑎22 ⎟
⎝ 𝑎13  𝑎23 ⎠
2.3 Scalar Multiplication

Multiplying a matrix 𝐀 by a scalar 𝛼 multiplies each element of 𝐀 by 𝛼. More
formally,

𝐂 = 𝛼 ⋅ 𝐀 (7)

implies

𝑐𝑖𝑗 = 𝛼 ⋅ 𝑎𝑖𝑗 , where 1 ≤ 𝑖 ≤ 𝑚 and 1 ≤ 𝑗 ≤ 𝑛 (8)
2.4 Matrix Addition

Two matrices with the same dimensions may be added element by element. More
formally,

𝐂 = 𝐀 + 𝐁 (9)

implies

𝑐𝑖𝑗 = 𝑎𝑖𝑗 + 𝑏𝑖𝑗 , where 1 ≤ 𝑖 ≤ 𝑚 and 1 ≤ 𝑗 ≤ 𝑛 (10)

Matrix addition is commutative.

𝐀 + 𝐁 = 𝐁 + 𝐀
Matrix addition is also associative.
(𝐀 + 𝐁) + 𝐂 = 𝐀 + (𝐁 + 𝐂)
The additive identity is the zero matrix. The additive inverse of a matrix 𝐀 is
denoted by −𝐀 and consists of the element by element negation of 𝐀, i.e. it is
the matrix formed when 𝐀 is multiplied by the scalar −1.
−𝐀 = −1 ⋅ 𝐀 (11)
2.5 Matrix Multiplication

If 𝐀 is an 𝑚 × 𝑝 matrix and 𝐁 is a 𝑝 × 𝑛 matrix, they may be multiplied to form
an 𝑚 × 𝑛 matrix 𝐂. More formally,

𝐂 = 𝐀𝐁 (12)

implies

𝑐𝑖𝑗 = Σₖ₌₁ᵖ 𝑎𝑖𝑘 𝑏𝑘𝑗 , where 1 ≤ 𝑖 ≤ 𝑚 and 1 ≤ 𝑗 ≤ 𝑛 (13)
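To make Equation 13 concrete, a minimal C sketch follows. The flat row-major
arrays and the function name are conveniences of the sketch, not part of the
original discussion.

/* C = A*B where A is m x p, B is p x n; all matrices are stored
   row-major in flat arrays. A direct rendering of Equation 13. */
void mat_mul(int m, int p, int n,
             const double *A, const double *B, double *C)
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;            /* accumulates the inner product */
            for (int k = 0; k < p; k++)
                sum += A[i*p + k] * B[k*n + j];
            C[i*n + j] = sum;
        }
    }
}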
In general, matrix multiplication is not commutative, i.e.

𝐀𝐁 ≠ 𝐁𝐀

As a consequence, the following terminology is sometimes used. Considering
the matrix product
the matrix product
𝐀𝐁
The left multiplicand 𝐀 is said to premultiply the matrix 𝐁. The right
multiplicand 𝐁 is said to postmultiply the matrix 𝐀.
Matrix multiplication distributes over matrix addition

𝐀 (𝐁 + 𝐂) = 𝐀𝐁 + 𝐀𝐂

and is associative

𝐀 (𝐁𝐂) = (𝐀𝐁) 𝐂
The transpose of a matrix product is the product of the factors in reverse
order, i.e.
(𝐀𝐁𝐂)𝐓 = 𝐂𝐓 𝐁𝐓 𝐀𝐓 (14)
The set of square matrices has a multiplicative identity which is denoted by
𝐈. The identity is a diagonal matrix with ones along the diagonal

𝑎𝑖𝑗 = 1, where 𝑖 = 𝑗
𝑎𝑖𝑗 = 0, where 𝑖 ≠ 𝑗 (15)
The 3 × 3 multiplicative identity is

    ⎛ 1  0  0 ⎞
𝐈 = ⎜ 0  1  0 ⎟
    ⎝ 0  0  1 ⎠
If 𝐀 is an 𝑛 × 𝑛 matrix and 𝐁 is an 𝑛 × 𝑛 matrix such that

𝐀𝐁 = 𝐈 (16)
then 𝐁 is a right inverse of 𝐀. Similarly, if 𝐂 is an 𝑛 × 𝑛 matrix such that
𝐂𝐀 = 𝐈 (17)
then 𝐂 is a left inverse of 𝐀. When both Equation 16 and Equation 17 hold
𝐀𝐁 = 𝐂𝐀 = 𝐈 (18)
then 𝐁 = 𝐂 and 𝐁 is the two-sided inverse of 𝐀.
The two-sided inverse of 𝐀 will be referred to as its multiplicative inverse or
simply its inverse. If the inverse of 𝐀 exists, it is unique and denoted by 𝐀−𝟏 .
𝐀−𝟏 exists if and only if 𝐀 is square and nonsingular. A square 𝑛 × 𝑛 matrix is
singular when its rank is less than 𝑛, i.e. two or more of its columns (or rows) are
linearly dependent. e rank of a matrix is examined more closely in Section
2.7 of this document.
A few additional facts about inverses follow. If 𝐀 is invertible, so is 𝐀−𝟏, and

(𝐀−𝟏)−𝟏 = 𝐀 (19)
If 𝐀 and 𝐁 are invertible, so is 𝐀𝐁, and

(𝐀𝐁)−𝟏 = 𝐁−𝟏𝐀−𝟏 (20)
2.7 Rank of a Matrix

The rank of a matrix is the number of linearly independent columns (or
equivalently, rows) it contains. The columns 𝐚𝟏, 𝐚𝟐, ⋯, 𝐚𝐧 of a matrix are
linearly independent when the only solution to

𝛼1 𝐚𝟏 + 𝛼2 𝐚𝟐 + ⋯ + 𝛼𝑛 𝐚𝐧 = 𝟎 (23)

is the set

𝛼1 = 𝛼2 = ⋯ = 𝛼𝑛 = 0
For a more concrete example, consider the following matrix.
    ⎛ 0  1  1  2 ⎞
𝐀 = ⎜ 1  2  3  4 ⎟
    ⎝ 2  0  2  0 ⎠
The rank of 𝐀 is two, since its third and fourth columns are linear combinations
of its first two columns, i.e.
𝐚𝟑 = 𝐚 𝟏 + 𝐚 𝟐
𝐚𝟒 = 2𝐚𝟐
If 𝐀 is an 𝑚 × 𝑛 matrix, it can be shown that

rank (𝐀) ≤ min (𝑚, 𝑛)
2.8 Similarity Transformations

If 𝐀 and 𝐁 are 𝑛 × 𝑛 matrices, 𝐁 is similar to 𝐀 (written 𝐀 ∼ 𝐁) when there
exists a nonsingular matrix 𝐏 such that

𝐁 = 𝐏𝐀𝐏−𝟏 (27)

Similarity is a reflexive relation: every matrix is similar to itself with 𝐏 = 𝐈.
The identity matrix and the zero matrix are similar only to themselves.
Similarity is a symmetric relation: if 𝐀 ∼ 𝐁, then 𝐁 ∼ 𝐀. To see this,
premultiply Equation 27 by 𝐏−𝟏 and postmultiply it by 𝐏, which yields

𝐀 = 𝐏−𝟏 𝐁𝐏 (28)
Similarity is also a transitive relation. If 𝐀 ∼ 𝐁 and 𝐁 ∼ 𝐂, then 𝐀 ∼ 𝐂.
Since similarity is reflexive, symmetric, and transitive, it is an equivalence
relation. A common example of a similarity transformation in linear algebra is
changing the basis of a vector space.
2.9 Partitioning a Matrix

A matrix may be partitioned into blocks of submatrices, e.g.

    ⎛ 𝐀𝟏𝟏  𝐀𝟏𝟐 ⎞
𝐀 = ⎝ 𝐀𝟐𝟏  𝐀𝟐𝟐 ⎠   (29)
If 𝐀𝟏𝟏 (𝑘 × 𝑘) and 𝐀𝟐𝟐 (𝑝 × 𝑝) are square matrices, then 𝐀𝟏𝟐 has dimensions 𝑘 × 𝑝
and 𝐀𝟐𝟏 has dimensions 𝑝 × 𝑘.
The transpose of 𝐀 is

     ⎛ 𝐀𝟏𝟏𝐓  𝐀𝟐𝟏𝐓 ⎞
𝐀𝐓 = ⎝ 𝐀𝟏𝟐𝐓  𝐀𝟐𝟐𝐓 ⎠   (30)
If 𝐀 is invertible, its inverse is

      ⎛ 𝐁𝟏𝟏  𝐁𝟏𝟐 ⎞
𝐀−𝟏 = ⎝ 𝐁𝟐𝟏  𝐁𝟐𝟐 ⎠   (31)

where

𝐁𝟏𝟏 = (𝐀𝟏𝟏 − 𝐀𝟏𝟐 𝐀𝟐𝟐−𝟏 𝐀𝟐𝟏)−𝟏
𝐁𝟏𝟐 = −𝐀𝟏𝟏−𝟏 𝐀𝟏𝟐 𝐁𝟐𝟐
𝐁𝟐𝟏 = −𝐀𝟐𝟐−𝟏 𝐀𝟐𝟏 𝐁𝟏𝟏
𝐁𝟐𝟐 = (𝐀𝟐𝟐 − 𝐀𝟐𝟏 𝐀𝟏𝟏−𝟏 𝐀𝟏𝟐)−𝟏   (32)

Alternately, the off-diagonal blocks may be expressed as

𝐁𝟏𝟐 = −𝐁𝟏𝟏 𝐀𝟏𝟐 𝐀𝟐𝟐−𝟏
𝐁𝟐𝟏 = −𝐁𝟐𝟐 𝐀𝟐𝟏 𝐀𝟏𝟏−𝟏   (33)
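As a quick sanity check (an illustration added here, not part of the original
exposition), let every block be a 1 × 1 scalar, so that 𝐀 has entries 𝑎, 𝑏, 𝑐, 𝑑.
Equation 32 then reduces to

𝐁𝟏𝟏 = (𝑎 − 𝑏𝑑−𝟏𝑐)−𝟏 = 𝑑 / (𝑎𝑑 − 𝑏𝑐)
𝐁𝟐𝟐 = (𝑑 − 𝑐𝑎−𝟏𝑏)−𝟏 = 𝑎 / (𝑎𝑑 − 𝑏𝑐)
𝐁𝟏𝟐 = −𝑎−𝟏𝑏𝐁𝟐𝟐 = −𝑏 / (𝑎𝑑 − 𝑏𝑐)
𝐁𝟐𝟏 = −𝑑−𝟏𝑐𝐁𝟏𝟏 = −𝑐 / (𝑎𝑑 − 𝑏𝑐)

which is the familiar closed form of the 2 × 2 inverse.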
3 Linear Systems
Consider the 𝑚 × 𝑛 system of linear equations
𝐀𝐱 = 𝐛 (35)
If 𝑚 = 𝑛 and 𝐀 is not singular, Equation 35 possesses a unique solution and is
referred to as a fully determined system of equations. When 𝑚 < 𝑛 (or 𝑚 = 𝑛
and 𝐀 is singular), Equation 35 is an underdetermined system of equations.
Otherwise, 𝑚 > 𝑛 and Equation 35 is an overdetermined system of equations.
3.1 Solving Fully Determined Systems

When 𝐀 is square and nonsingular, premultiplying both sides of Equation 35
by 𝐀−𝟏 yields the solution

𝐱 = 𝐀−𝟏 𝐛 (36)

since 𝐀−𝟏 𝐀 is by definition the multiplicative identity.
The solution algorithm suggested by Equation 36 is
1. Invert matrix 𝐀.
2. Premultiply the vector 𝐛 by 𝐀−𝟏 .
In practice, explicitly computing 𝐀−𝟏 is unnecessary and inefficient (see
Section 3.4). The preferred approach factors 𝐀 into a lower triangular matrix
𝐋 and an upper triangular matrix 𝐔 such that

𝐋𝐔 = 𝐀 (37)
Substituting Equation 37 into Equation 35 yields
(𝐋𝐔) 𝐱 = 𝐛 (38)
Associating the factors in Equation 38 yields
𝐋 (𝐔𝐱) = 𝐛 (39)
Recalling that efficient procedures exist for solving triangular systems (i.e.
forward substitution for lower triangular systems and backward substitution for
upper triangular systems), Equation 39 suggests an algorithm for solving
Equation 35. Define a vector 𝐲 such that
𝐲 = 𝐔𝐱 (40)
Substituting Equation 40 into Equation 39 yields
𝐋𝐲 = 𝐛 (41)
Since 𝐛 is known and 𝐋 is lower triangular, Equation 41 can be solved for 𝐲
by forward substitution. Once 𝐲 is known, Equation 40 can be solved for 𝐱 by
back substitution.
In summary, the preferred algorithm for solving a nonsingular 𝑛 × 𝑛
system of linear equations is
1. Compute an 𝐋𝐔 decomposition of 𝐀.
2. Solve Equation 41 for 𝐲 by forward substitution.
3. Solve Equation 40 for 𝐱 by back substitution.
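A small worked example (added here for illustration) traces the three steps. Let

𝐀 = ⎛ 4  3 ⎞   𝐛 = ⎛ 10 ⎞
    ⎝ 6  3 ⎠       ⎝ 12 ⎠

Step 1 produces the factors with the single multiplier 𝑙21 = 6/4 = 1.5:

𝐋 = ⎛ 1    0 ⎞   𝐔 = ⎛ 4    3  ⎞
    ⎝ 1.5  1 ⎠       ⎝ 0  −1.5 ⎠

Step 2 solves 𝐋𝐲 = 𝐛 by forward substitution: 𝑦1 = 10 and 𝑦2 = 12 − 1.5 ⋅ 10 = −3.
Step 3 solves 𝐔𝐱 = 𝐲 by back substitution: 𝑥2 = −3 / −1.5 = 2 and
𝑥1 = (10 − 3 ⋅ 2) / 4 = 1. The solution 𝐱 = (1, 2)𝐓 satisfies the original system.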
3.2 Solving Underdetermined Systems

When Equation 35 is underdetermined, it does not possess a unique solution.
Define

𝑞 = 𝑛 − 𝑟

where

𝑟 = rank (𝐀)

The value 𝑞 is referred to as the nullity of matrix 𝐀. The null space of 𝐀 is the
set of vectors 𝐱 for which 𝐀𝐱 = 𝟎; its dimension is 𝑞.
3.4 Computational Complexity of Linear Systems

Computing the inverse of an 𝑛 × 𝑛 matrix 𝐀 requires

2𝑛³ + 𝑂(𝑛²)

floating point operations. Computing the LU decomposition of 𝐀 requires

(2/3)𝑛³ + (1/2)𝑛² + (1/6)𝑛

or

(2/3)𝑛³ + 𝑂(𝑛²)

floating point operations. Computing 𝐱 from the factorization requires

2𝑛² + 𝑛

or

2𝑛² + 𝑂(𝑛)

floating point operations (which is equivalent to computing the product 𝐀−𝟏 𝐛).
Therefore, solving a linear system of equations by matrix inversion requires
approximately three times the amount of work as a solution via LU decomposition.
When 𝐀 is a sparse matrix, the computational discrepancy between the two
methods becomes even more overwhelming. The reason is straightforward. In
general, the factors 𝐋 and 𝐔 of a sparse matrix remain relatively sparse, while
its inverse 𝐀−𝟏 is dense.
4 LU Decomposition

There are many algorithms for computing the LU decomposition of the matrix
𝐀. All algorithms derive a matrix 𝐋 and a matrix 𝐔 that satisfy Equation 37.
Most algorithms also permit 𝐋 and 𝐔 to occupy the same amount of space as 𝐀.
This implies that either 𝐋 or 𝐔 is computed as a unit triangular matrix so that
explicit storage is not required for its diagonal (which is all ones).

There are two basic approaches to arriving at an LU decomposition:

• Gaussian elimination, which reduces 𝐀 to an upper triangular matrix 𝐔 and
  retains the multipliers of the elimination as 𝐋, and
• compact factorization schemes, such as Doolittle's and Crout's methods, which
  compute the entries of 𝐋 and 𝐔 directly from inner product formulas.

Discussions of the subject by Fox (1964) [1], Golub and Van Loan (1983) [2],
Duff, et al. (1986) [3], and Press, et al. (1988) [4] are complementary in many
respects. Taken as a group, these works provide a good sense of perspective
concerning the problem.
4.1 Gaussian Elimination

Gaussian elimination reduces 𝐀 to an upper triangular matrix through a sequence
of row operations. At the 𝑘𝑡ℎ stage, the entries below the diagonal in column 𝑘
are eliminated by subtracting multiples of row 𝑘 from the rows beneath it:

𝑎𝑖𝑗(𝑘+1) = 𝑎𝑖𝑗(𝑘) − ( 𝑎𝑖𝑘(𝑘) / 𝑎𝑘𝑘(𝑘) ) 𝑎𝑘𝑗(𝑘) , where 𝑖, 𝑗 > 𝑘 (43)

The notation 𝑎𝑖𝑗(𝑘) means the value of 𝑎𝑖𝑗 produced during the 𝑘𝑡ℎ stage of the
elimination procedure. In Equation 43, the term 𝑎𝑖𝑘(𝑘) / 𝑎𝑘𝑘(𝑘) (sometimes
referred to as a multiplier) captures the crux of the elimination process. It
describes the effect of eliminating element 𝑎𝑖𝑘 on the other entries in row 𝑖
during the 𝑘𝑡ℎ stage of the elimination. In fact, these multipliers are the
elements of the lower triangular matrix 𝐋, i.e.

𝑙𝑖𝑘 = 𝑎𝑖𝑘(𝑘) / 𝑎𝑘𝑘(𝑘) (44)

Algorithm 1 implements Equation 43 and Equation 44 and computes the LU
decomposition of an 𝑚 × 𝑛 matrix 𝐀 (it is based on Algorithm 4.2-1 of Golub
and Van Loan (1983) [2]).

The algorithm overwrites 𝑎𝑖𝑗 with 𝑙𝑖𝑗 when 𝑖 > 𝑗. Otherwise, 𝑎𝑖𝑗 is overwritten
by 𝑢𝑖𝑗. The algorithm creates a matrix 𝐔 that is upper triangular and a matrix 𝐋
that is unit lower triangular. Note that a working vector 𝐰 of length 𝑛 is
required by the algorithm.
Algorithm 1: LU Decomposition

for 𝑘 = 1, ⋯ , min(𝑚 − 1, 𝑛)
    for 𝑗 = 𝑘 + 1, ⋯ , 𝑛
        𝑤𝑗 = 𝑎𝑘𝑗
    for 𝑖 = 𝑘 + 1, ⋯ , 𝑚
        𝛼 = 𝑎𝑖𝑘 / 𝑎𝑘𝑘
        𝑎𝑖𝑘 = 𝛼
        for 𝑗 = 𝑘 + 1, ⋯ , 𝑛
            𝑎𝑖𝑗 = 𝑎𝑖𝑗 − 𝛼𝑤𝑗
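A direct C transcription of Algorithm 1 might look as follows. The zero based
indexing, flat row-major storage, and function name are assumptions of this
sketch, and (like Algorithm 1 itself) it performs no pivoting.

/* In-place LU decomposition of an m x n matrix a (row-major, 0-based).
   On exit, the strict lower triangle holds L (unit diagonal implied)
   and the upper triangle holds U. w is a caller-supplied work vector
   of length n. */
void lu_decompose(int m, int n, double *a, double *w)
{
    int kmax = (m - 1 < n) ? (m - 1) : n;       /* min(m-1, n) */
    for (int k = 0; k < kmax; k++) {
        for (int j = k + 1; j < n; j++)         /* save the pivot row */
            w[j] = a[k*n + j];
        for (int i = k + 1; i < m; i++) {
            double alpha = a[i*n + k] / a[k*n + k];  /* multiplier l_ik */
            a[i*n + k] = alpha;
            for (int j = k + 1; j < n; j++)
                a[i*n + j] -= alpha * w[j];
        }
    }
}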
4.2 Doolittle's LU Factorization

[Figure 1: Computational Sequence of Doolittle's Method]
Doolittle's method computes each entry of the factorization from an inner
product, which gives it a numerical advantage over stage-by-stage elimination.
This advantage is based on the fact that the product of two single precision
floating point numbers is always computed with double precision arithmetic (at
least in the C programming language). Because of this, the product 𝑎𝑖𝑝 𝑎𝑝𝑗
suffers no loss of precision. If the product is accumulated in a double
precision variable 𝛼, there is no loss of precision during the entire inner
product calculation. Therefore, one double precision variable can preserve the
numerical integrity of the inner product.
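The point is easily illustrated in C. The helper below is a sketch of the inner
product kernel 𝑎𝑖𝑗 − Σ 𝑙𝑖𝑝 𝑢𝑝𝑗 shared by the compact factorization schemes; the
single precision storage is an assumption carried over from the surrounding
discussion.

/* Inner product accumulation for one factorization entry. The factors
   are stored in single precision; each product is formed in double
   precision, and alpha preserves that precision across the summation. */
float inner_product_entry(float aij, const float *l, const float *u, int len)
{
    double alpha = aij;                        /* double precision accumulator */
    for (int p = 0; p < len; p++)
        alpha -= (double)l[p] * (double)u[p];  /* product formed in double */
    return (float)alpha;                       /* one rounding at the very end */
}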
4.3 Crout's LU Factorization

Crout's method restructures the partial sum accumulation loop of the
elimination-based procedure into explicit inner products. Its factors are a
lower triangular 𝐋 and a unit upper triangular 𝐔 whose entries are

𝑙𝑖𝑗 = 𝑎𝑖𝑗 − Σₚ₌₁ʲ⁻¹ 𝑙𝑖𝑝 𝑢𝑝𝑗 , where 𝑖 ≥ 𝑗 (49)

and

𝑢𝑖𝑗 = ( 𝑎𝑖𝑗 − Σₚ₌₁ⁱ⁻¹ 𝑙𝑖𝑝 𝑢𝑝𝑗 ) / 𝑙𝑖𝑖 , where 𝑖 < 𝑗 (50)
Algorithm 3 implements Crout's method. Calculations are sequenced to compute
one column of 𝐋 followed by the corresponding row of 𝐔 until 𝐀 is exhausted.

Figure 2 depicts the computational sequence associated with Crout's method.
You should observe that Crout's method, like Doolittle's, exhibits inner
product accumulation.

A good comparison of the various compact factorization schemes is found
in Duff, et al. (1986) [3].
[Figure 2: Computational Sequence of Crout's Method]
4.4 LDU Factorization

A variant of LU decomposition produces three factors such that

𝐋𝐃𝐔 = 𝐀 (51)

where 𝐋 is unit lower triangular, 𝐃 is diagonal, and 𝐔 is unit upper triangular.
It should be obvious that the storage requirements of LDU decompositions and
LU decompositions are the same.

A procedure proposed by Tinney and Walker (1967) [6] provides a concrete
example of an LDU decomposition that is based on Gaussian elimination. One
row of the subdiagonal portion of 𝐀 is eliminated at each stage of the
computation. Tinney refers to the LDU decomposition as a "table of factors".
Figure 3 depicts the computational sequence of the procedure.
[Figure 3: Computational Sequence of Tinney's LDU Decomposition. Note: 𝑎𝑖𝑗(𝑘)
is the 𝑘𝑡ℎ stage partial sum of 𝑎𝑖𝑗.]
4.6 Pivoting Strategies for Numerical Stability

Two pivoting strategies are commonly used to maintain numerical stability
during factorization:

• At the 𝑘𝑡ℎ stage of the computation, choose the largest remaining element
  in 𝐀 as the pivot. If pivoting has proceeded along the diagonal in stages
  1 through 𝑘 − 1, this implies the next pivot should be the largest element
  𝑎𝑖𝑗(𝑘−1) where 𝑘 ≤ 𝑖 ≤ 𝑛 and 𝑘 ≤ 𝑗 ≤ 𝑛. This strategy is referred to as
  complete pivoting.

• At the 𝑘𝑡ℎ stage of the computation, select the largest element in column
  𝑘 as the pivot. This strategy is referred to as partial pivoting.
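As an illustration of the second strategy, a C sketch of the column scan behind
partial pivoting follows; the dense storage and the perm bookkeeping array are
assumptions of the sketch.

#include <math.h>

/* Stage k of partial pivoting: find the largest |a_ik| on or below the
   diagonal of column k, then swap that row into the pivot position.
   perm records the interchanges, i.e. it plays the role of P. */
void partial_pivot(int n, double *a, int *perm, int k)
{
    int p = k;
    for (int i = k + 1; i < n; i++)              /* scan column k */
        if (fabs(a[i*n + k]) > fabs(a[p*n + k]))
            p = i;
    if (p != k) {
        for (int j = 0; j < n; j++) {            /* swap rows k and p */
            double t = a[k*n + j];
            a[k*n + j] = a[p*n + j];
            a[p*n + j] = t;
        }
        int t = perm[k]; perm[k] = perm[p]; perm[p] = t;
    }
}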
4.8 Partial Pivoting

When row interchanges are performed during factorization, the decomposition
yields

𝐋𝐔 = 𝐏𝐀 (54)

where 𝐏 is a permutation matrix that is derived as follows:
1. 𝐏 is initialized to 𝐈.
2. Each row interchange that occurs during the decomposition of 𝐀 causes
a corresponding row swap in 𝐏.
Recalling Equation 35

𝐀𝐱 = 𝐛
and premultiplying both sides by 𝐏
𝐏𝐀𝐱 = 𝐏𝐛
Using Equation 54 to substitute for 𝐏𝐀 yields
𝐋𝐔𝐱 = 𝐏𝐛 (55)
Following the same train of logic used to derive Equation 40 and Equation 41,
a solution to Equation 35 is achieved by the sequential solution of two
triangular systems.
𝐲 = 𝐏𝐛 (56)
𝐋𝐜 = 𝐲
𝐔𝐱 = 𝐜
4.9 Complete Pivoting

When both row and column interchanges are performed, the decomposition yields

𝐋𝐔 = 𝐏𝐀𝐐 (57)

where 𝐏 is a row permutation matrix and 𝐐 is a column permutation matrix.
𝐐 is derived from column interchanges in the same way 𝐏 is derived from row
interchanges.
If 𝐀 and its factors are related according to Equation 57, then Equation 35
can still be solved for 𝐱 by the sequential solution of two triangular systems.
𝐲 = 𝐏𝐛 (58)
𝐋𝐜 = 𝐲 (59)
𝐔𝐳 = 𝐜 (60)
𝐱 = 𝐐𝐳 (61)
When the pivot elements are confined to the diagonal, the column permutation
is simply the transpose of the row permutation, i.e.

𝐐 = 𝐏𝐓
4.10 Computational Complexity of Pivoting

Pivoting adds overhead to LU decomposition in the form of floating point
comparisons. Scanning for the largest element in the pivot column at each
stage of the factorization implies that partial pivoting requires

(𝑛² + 𝑛) / 2

floating point comparisons.
5 Solving Triangular Systems

Triangular systems are solved by simple substitution procedures. Forward
substitution solves the lower triangular system 𝐋𝐲 = 𝐛 by computing the
unknowns in ascending order, i.e.

𝑦𝑖 = ( 𝑏𝑖 − Σ𝑗<𝑖 𝑙𝑖𝑗 𝑦𝑗 ) / 𝑙𝑖𝑖 , for 𝑖 = 1, ⋯ , 𝑛

Backward substitution solves the upper triangular system 𝐔𝐱 = 𝐲 by computing
the unknowns in descending order, i.e.

𝑥𝑖 = ( 𝑦𝑖 − Σ𝑗>𝑖 𝑢𝑖𝑗 𝑥𝑗 ) / 𝑢𝑖𝑖 , for 𝑖 = 𝑛, ⋯ , 1

When a factor is unit triangular, the division is omitted. Each procedure
requires approximately 𝑛² floating point operations. Algorithms 4 through 7
realize these procedures in both inner product and outer product form.
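A minimal C sketch of both procedures follows; the dense row-major storage and
zero based indexing are conveniences of the sketch, not requirements.

/* Forward substitution: solve Ly = b for a lower triangular L. */
void forward_substitute(int n, const double *l, const double *b, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = b[i];
        for (int j = 0; j < i; j++)
            sum -= l[i*n + j] * y[j];
        y[i] = sum / l[i*n + i];    /* omitted when L is unit triangular */
    }
}

/* Backward substitution: solve Ux = y for an upper triangular U. */
void backward_substitute(int n, const double *u, const double *y, double *x)
{
    for (int i = n - 1; i >= 0; i--) {
        double sum = y[i];
        for (int j = i + 1; j < n; j++)
            sum -= u[i*n + j] * x[j];
        x[i] = sum / u[i*n + i];
    }
}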
6 Factor Update
If the LU decomposition of the matrix 𝐀 exists and the factorization of a related
matrix

𝐀̃ = 𝐀 + 𝚫𝐀 (64)

is needed, it is sometimes advantageous to compute the factorization of 𝐀̃ by
modifying the factors of 𝐀 rather than explicitly decomposing 𝐀̃. Implementations
of this factor update operation should have the following properties:

• Arithmetic is minimized,
• Numerical stability is maintained, and
• Sparsity is preserved.

The most common factor update problem is the rank one modification

𝐀̃ = 𝐀 + 𝛼𝐲𝐳𝐓 (65)

where 𝛼 is a scalar and the vectors 𝐲 and 𝐳 are dimensionally correct. The
terminology comes from the observation that the product 𝛼𝐲𝐳𝐓 is a matrix whose
rank is one.

Computationally, a rank one factor update to a dense matrix is an 𝑂(𝑛²)
operation. Recall that decomposing a matrix from scratch is 𝑂(𝑛³).
6.2 LU Factor Update

The LDU factor update procedure of Algorithm 8 is readily adapted to LU
factorizations. Consider the Algorithm 8 update statement

𝑧𝑗 = 𝑧𝑗 − 𝑞𝑢𝑖𝑗 (67)

If 𝐔 is upper triangular rather than unit upper triangular, the statement becomes

𝑧𝑗 = 𝑧𝑗 − 𝑝 ( 𝑢𝑖𝑗 / 𝛿 ) (68)

where 𝛿 is the value of 𝑢𝑖𝑖 before it was changed during stage 𝑖 of the procedure.
Along the same lines, the factor update statement
becomes

𝑢𝑖𝑗 = 𝑢𝑖𝑖 ( 𝑢𝑖𝑗 / 𝛿 + 𝛽2 𝑧𝑗 ) (71)
Taking these observations into consideration and pulling operations on
constants out of the inner loop, Algorithm 9 updates 𝐔 based on a rank one
change to 𝐀.
If 𝐔 is unit upper triangular and 𝐋̂ is lower triangular, a similar algorithm is
derived from the observation that 𝑙̂𝑖𝑗 of 𝐋̂ and 𝑙𝑖𝑗 , 𝑑𝑖 of the 𝐋𝐃𝐔 decomposition
are related as follows.
6.3 Additional Considerations
Bennett's algorithm has proven to be stable for many physical problems with
reasonable values of 𝛼, 𝐲, and 𝐳. The algorithm rarely exhibits instability when
it is applied to diagonally dominant matrices where pivoting is not required.
Gill, et al. (1974) [10] describe alternate algorithms for situations where
stability problems arise.

Hager (1989) [11] provides a good overview of approaches to the problem of
updating the inverse of a matrix and describes practical areas in which the
problem arises. Chan and Brandwajn (1986) [12] examine applications in network
analysis.
7 Symmetric Matrices
Recall that an 𝑛 × 𝑛 symmetric matrix 𝐀 is its own transpose
𝐀 = 𝐀𝐓
This being the case, the elements of 𝐀 are described by the following
relationship.

𝑎𝑖𝑗 = 𝑎𝑗𝑖 (72)

7.1 LDU Decomposition of Symmetric Matrices

If a symmetric matrix 𝐀 has an 𝐋𝐃𝐔 decomposition, symmetry implies that

𝐋 = 𝐔𝐓 (73)

and

𝐔 = 𝐋𝐓 (74)

For this reason, the 𝐋𝐃𝐔 decomposition of a symmetric matrix is sometimes
referred to as an 𝐋𝐃𝐋𝐓 decomposition. The elements of 𝐋 and 𝐔 in the LDU
decomposition of a symmetric matrix are related as follows.

𝑙𝑖𝑗 = 𝑢𝑗𝑖 (75)
7.2 LU Decomposition of Symmetric Matrices

Though simple, the relationships of Equation 73 through Equation 75 will prove
useful in deriving symmetric variants of the algorithms discussed in Sections 4
and 5.
In other words, the symmetric factorization algorithms discussed in this
document assume an 𝐋𝐔 decomposition exists (or is to be computed) such that one
of the factors is unit triangular and the diagonal matrix 𝐃 is absorbed into the
other factor.

This implies that algorithms which deal with an explicit set of lower triangular
factors, call them 𝐋̂, will associate the factors of an implicit 𝐋𝐃𝐔
decomposition as follows

𝐋̂ = 𝐋𝐃 (76)

or

𝐋 = 𝐋̂𝐃−𝟏

Substituting for 𝐋 based on Equation 73 yields

𝐔𝐓 = 𝐋̂𝐃−𝟏 (77)

Recalling that the inverse of a diagonal matrix is the arithmetic inverse of each
element and taking the product yields

𝑢𝑗𝑖 = 𝑙̂𝑖𝑗 / 𝑑𝑗𝑗

Since 𝑑𝑗𝑗 = 𝑙̂𝑗𝑗 ,

𝑢𝑗𝑖 = 𝑙̂𝑖𝑗 / 𝑙̂𝑗𝑗 (78)

In a similar vein, algorithms that deal with an explicit set of upper triangular
factors, call them 𝐔̂, will associate the factors of an 𝐋𝐃𝐔 decomposition as
follows.

𝐔̂ = 𝐃𝐔 (79)

This association yields the following relationship between the explicit factors
𝐔̂ and the implicit factors 𝐋.
𝑙𝑖𝑗 = 𝑢̂𝑗𝑖 / 𝑢̂𝑗𝑗 (80)
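When only the explicit factors are physically stored, Equation 78 and
Equation 80 let an algorithm manufacture entries of the implicit factor on
demand. A C sketch under those conventions (the dense row-major accessor is an
assumption of the sketch):

/* Entry l_ij of the implicit unit lower triangular factor L, derived
   on the fly from an explicit upper triangular factor Uhat = DU via
   Equation 80: l_ij = uhat_ji / uhat_jj for i > j. */
double implicit_l(const double *uhat, int n, int i, int j)
{
    if (i == j) return 1.0;    /* the implicit diagonal is unit */
    if (i <  j) return 0.0;    /* entries above the diagonal are zero */
    return uhat[j*n + i] / uhat[j*n + j];
}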
7.3 Symmetric Matrix Data Structures

Since 𝑎𝑖𝑗 = 𝑎𝑗𝑖, only the diagonal and one triangle of a symmetric matrix need
to be stored explicitly; the 𝑛(𝑛 + 1)/2 retained entries are commonly packed
row by row into a linear array. You will observe that the dimension of 𝐀 does
not enter the indexing calculation when its lower triangular portion is
retained. The indexing equations are implemented most efficiently by replacing
division by two with a right shift.
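For concreteness, a C sketch of the indexing computation for a lower triangular
portion stored row by row in a linear array with zero based indexing; the
offset formula 𝑖(𝑖 + 1)/2 + 𝑗 is the standard packed layout assumed here.

#include <stddef.h>

/* Offset of a_ij within the packed lower triangle (0-based, i >= j).
   Note that the matrix dimension does not appear in the formula. */
size_t lower_index(size_t i, size_t j)
{
    return ((i * (i + 1)) >> 1) + j;   /* i(i+1)/2 + j via right shift */
}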
7.6 Forward Substitution for Symmetric Systems
Since the implicit factor 𝐋 is unit lower triangular (Equation 80 gives
𝑙𝑖𝑖 = 1), the initial division by 𝑙𝑖𝑖 is omitted and the division by 𝑢𝑖𝑖 is
pulled out of the 𝑘 loop. The outer product formulation of forward substitution
where 𝐔 is stored in a linear array with zero based indexing is realized by
Algorithm 17.
7.8 Symmetric Factor Update
Much of the effort in updating a symmetric factorization reduces to making sure
that implicit data (i.e. the portion of the symmetric factorization that is not
physically stored) is correctly derived from the explicitly stored data. See
Section 7.2 for a discussion of implicit data in symmetric LU decompositions.
When the factorization of a modified matrix

𝐀̃ = 𝐀 + 𝚫𝐀

is desired, factor update is often the procedure of choice.

Section 6 examines factor update techniques for dense, asymmetric matrices.
The current section examines techniques that exploit computational efficiencies
introduced by symmetry. Symmetry reduces the work required to update the
factorization of 𝐀 by half, just as it reduces the work required to decompose 𝐀
in the first place.

More specifically, the current section examines procedures for updating the
factors of 𝐀 following a symmetric rank one modification

𝐀̃ = 𝐀 + 𝛼𝐲𝐲𝐓

where 𝛼 is a scalar and 𝐲 is an 𝑛 vector.
If there is a reasonable probability of leading zeros in 𝐲, testing 𝑦𝑖 for zero
at the beginning of the loop might save a lot of work. However, you must
remember to suspend the test as soon as the first nonzero value of 𝑦𝑖 is
encountered.
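The loop structure might be sketched in C as follows; update_stage is a
hypothetical stand-in for the per-stage work of the factor update algorithm.

void update_stage(int i);   /* hypothetical per-stage factor update */

/* Skip the leading zeros of y, then stop testing: once the first
   nonzero y_i has been processed, later stages can make y nonzero
   even where it started as zero, so the test must be suspended. */
void sparse_rank_one_update(int n, const double *y)
{
    int i = 0;
    while (i < n && y[i] == 0.0)   /* leading zeros contribute nothing */
        i++;
    for (; i < n; i++)             /* test suspended from here on */
        update_stage(i);
}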
For a fuller discussion of the derivation and implementation of LU factor
update, see Section 6.2.
8 Sparse Matrices
The preceding sections examined dense matrix algorithms for solving systems
of linear equations. It was seen that significant savings in storage and
computation are achieved by exploiting the structure of symmetric matrices. An
even more dramatic performance gain is possible by exploiting the sparsity
intrinsic to many classes of large systems. Sparse matrix algorithms are based
on the simple concept of avoiding the unnecessary storage of zeros and
unnecessary arithmetic associated with zeros (such as multiplication by zero or
addition of zero). Recognizing and taking advantage of sparsity often permits
the solution of problems that are otherwise computationally intractable.
Practical examples provided by Tinney and Hart (1967) [13] suggest that in the
analysis of large power system networks the use of sparse matrix algorithms
makes both the storage and computational requirements approximately linear with
respect to the size of the network. In other words, data storage is reduced
from an 𝑂(𝑛²) problem to an 𝑂(𝑛) problem and computational complexity
diminishes from 𝑂(𝑛³) to 𝑂(𝑛).
8.2 Abstract Data Types for Sparse Matrices
The operations required of each abstract data type are specified, but the actual
data structures used to implement them are left undefined. Any data structure
that efficiently satisfies the constraints imposed in this section is suited for
the job.

All signals emitted by the operators defined in this section are used to
navigate through data, not to indicate errors. Error processing is intentionally
omitted from the algorithms appearing in this document. The intent is to avoid
clutter that obscures the nature of the algorithms.
8.2.1 Sparse Matrix Proper

The following operations are supported on a sparse matrix 𝐀:

• Get retrieves an arbitrary element 𝑎𝑖𝑗 of 𝐀. If 𝑎𝑖𝑗 is not stored, get
  signals that fact.
• Put updates the value of an arbitrary element 𝑎𝑖𝑗 that is already stored
  in 𝐀.
• Insert adds an arbitrary element 𝑎𝑖𝑗 to 𝐀. If 𝑎𝑖𝑗 does not already exist,
  insert signals a successful insertion.
• Scan sequentially retrieves the stored elements of a row of 𝐀.
The algorithms assume that operations that read the data structure (get and
scan) make the designated element 𝑎𝑖𝑗 of 𝐀 available in a buffer (this buffer is
usually denoted by the symbol 𝑎). Operations that update 𝑎𝑖𝑗 (insert and
put) do so based on the current contents of the communication buffer 𝑎.
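One possible C rendering of this interface follows; it is a sketch, and the type
and function names are assumptions rather than definitions from this document.

/* Abstract sparse matrix interface. The buffer a communicates element
   values: get and scan fill it, insert and put consume it. The int
   return values are navigation signals, not error codes. */
typedef struct sparse_matrix sparse_matrix;   /* representation hidden */

int sm_get   (sparse_matrix *A, int i, int j, double *a);  /* read a_ij    */
int sm_put   (sparse_matrix *A, int i, int j, double  a);  /* update a_ij  */
int sm_insert(sparse_matrix *A, int i, int j, double  a);  /* add a_ij     */
int sm_scan  (sparse_matrix *A, int i, int *j, double *a); /* next in row i */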
Section 9.1 examines one possible realization of the sparse matrix data type.
8.2.2 Adjacency List

The adjacency list of a graph represents each of its edges as a pair
of vertex labels (𝑖, 𝑗). Descriptive information is usually associated with each
edge.

Since both adjacency lists and sparse matrices represent sparse networks, it
should come as no surprise that they require a similar set of operations. More
specifically, the following operations are supported on an adjacency list 𝐴:
The algorithms assume that read operations (get and scan) make edge
information available in a buffer (this buffer is usually denoted by the symbol
𝑎). Update operations (insert and put) modify the description of an edge
based on the current contents of the communication buffer.

Implementation of adjacency lists is examined in detail in Graph Algorithms¹.
8.2.4 List
A simple list 𝐿 is an ordered set of elements. If the set {𝑙1 , ⋯ , 𝑙𝑖 , 𝑙𝑖+1 , ⋯ , 𝑙𝑛 }
represents 𝐿, then the list contains 𝑛 elements. Element 𝑙1 is the first item on
the list and 𝑙𝑛 is the last item on the list. Element 𝑙𝑖 precedes 𝑙𝑖+1 and element
𝑙𝑖+1 follows 𝑙𝑖 . Element 𝑙𝑖 is at position 𝑖 in 𝐿. Descriptive information may
accompany each item on a list. Lists associated with matrix algorithms support
the following operations:

• Find looks for an element on the list and returns its position. If the
  element is not a member of the list, 𝑒𝑜𝑙 is returned.
• First returns the position of the first item on the list. When the list is
  empty, 𝑒𝑜𝑙 is returned.
• Next returns position 𝑖 + 1 on the list if position 𝑖 is provided. If 𝑙𝑖 is the
  last item on the list, 𝑒𝑜𝑙 is returned.
• Prev returns position 𝑖 − 1 on the list if position 𝑖 is provided. If 𝑖 is one,
  𝑒𝑜𝑙 is returned.

A linked list refers to a list implementation that does not require its members
to reside in contiguous storage locations. In this environment, an efficient
implementation of the prev operator dictates the use of a doubly linked list.
¹ https://vismor.com/documents/network_analysis/graph_algorithms/
8.2.5 Mapping
A mapping 𝜇 relates elements of its domain 𝑑 to elements of its range 𝑟 as follows.
𝜇(𝑑) = 𝑟
A mapping resides in a data structure that supports two operations: one that
defines the mapping of 𝑑 (i.e. associates a range element 𝑟 with 𝑑) and one
that evaluates 𝜇(𝑑).
8.2.6 Vector
For simplicity of exposition, a full vector is represented as a linear array. How-
ever, any data structure that lets you retrieve and update an arbitrary element 𝑏𝑖
of a vector 𝐛 based upon its index 𝑖 will suffice.
8.3 Pivoting To Preserve Sparsity
Numerical stability is maintained by insisting that each pivot candidate satisfy
a threshold criterion

|𝑎𝑖𝑗(𝑘)| ≥ 𝑢 ⋅ max𝑙≥𝑘 |𝑎𝑖𝑙(𝑘)| (91)

where the threshold 0 < 𝑢 ≤ 1 balances stability against sparsity.
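For reference, the fill-reducing half of the Markowitz strategy scores a
candidate 𝑎𝑖𝑗 by the product (𝑟𝑖 − 1)(𝑐𝑗 − 1), an estimate of the fill-in the
pivot can cause; a C sketch (the count arrays are assumptions of the sketch):

/* Markowitz count of candidate a_ij: r[i] and c[j] are the nonzero
   counts of row i and column j of the active submatrix. Among the
   candidates that pass the threshold test of Equation 91, the one
   with the smallest count is preferred. */
long markowitz_count(const int *r, const int *c, int i, int j)
{
    return (long)(r[i] - 1) * (long)(c[j] - 1);
}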
8.4 Symbolic Factorization of Sparse Matrices

Symbolic factorization determines the nonzero structure of the factors

𝐋𝐔 = 𝐏𝐀𝐐 (92)

where 𝐏 and 𝐐 are row and column permutations that reflect the pivot strategy
associated with the factorization process.
8.5 Creating 𝐏𝐀𝐏𝐓 from a Symbolic Factorization
The current section examines the construction of the permuted matrix 𝐏𝐀𝐏𝐓 from
an adjacency list 𝐴 produced by symbolic factorization. It is assumed that 𝐴 has
been expanded to accommodate the fill-ups that will occur during 𝐋𝐔
decomposition and that the permutation matrix 𝐏 is defined by the pivot
ordering. See Section 8.4.1 for details concerning creation of the augmented
adjacency list 𝐴 and the permutation matrix 𝐏, i.e. the minimum degree traversal
𝐿 and vertex label mapping 𝜓.
It is also assumed that both the adjacency list 𝐴 and the matrix 𝐏𝐀𝐏𝐓 are
maintained in sparse data structures supporting the scan and insert
operators. Communication with the data structures is maintained through buffers
𝑎 and 𝑝𝑎𝑝 in the normal manner. It is further assumed that procedure make
creates element 𝑎𝑖𝑗 of 𝐀 when its row and column indices, 𝑖 and 𝑗, are specified.

Algorithm 23 constructs a full matrix 𝐏𝐀𝐏𝐓 based on these assumptions.
Zero valued entries are created for elements that will fill up during 𝐋𝐔
decomposition.

Algorithm 24 constructs the symmetric matrix 𝐏𝐀𝐏𝐓 based on these
assumptions.
8.6 Numeric Factorization of Sparse Matrices
Numeric factorization computes factors that satisfy

𝐋𝐔 = 𝐏𝐀𝐏𝐓

Algorithms discussed in the current section act on a sparse 𝑛 × 𝑛 matrix 𝐀. They
assume that
• 𝐀 already reflects the pivot strategy defined by 𝐏 and 𝐏𝐓, i.e. the algorithms
  pivot down the diagonal.
• 𝐀 has zero-valued entries at fill-up locations.
• 𝐀 is maintained in a sparse data structure supporting the get, scan, and
  put operators. Communication with the data structure is maintained
  through the buffer 𝑎 in the normal manner.

See Section 8.5 for details concerning the creation of a pivot ordered, fill-up
augmented 𝐀 matrix.
Algorithm 25 uses Doolittle's method (see Section 4.2 for more information)
to compute the 𝐋𝐔 decomposition of a sparse matrix 𝐀. The algorithm
overwrites 𝐀 with 𝐋𝐔.

Algorithm 26 uses Doolittle's method to compute 𝐔, the upper triangular
factors, of a symmetric sparse matrix 𝐀. It is assumed that 𝐀 is initially stored
as an upper triangular matrix.

The algorithm overwrites 𝐀 with 𝐔. The vector 𝐰 is used to construct the
nonzero entries of each column of 𝐋 from 𝐔. The vector 𝐜 contains cursors to the
row in 𝐋 with which the entries of 𝐰 are associated, e.g. if 𝑤𝑘 contains 𝑙𝑗𝑖 then
𝑐𝑘 is 𝑗.
8.7 Solving Sparse Linear Systems
Once the factorization 𝐋𝐔 = 𝐏𝐀𝐐 is in hand, the sparse system of Equation 35 is
solved by the familiar sequence of operations

𝐲 = 𝐏𝐛
𝐋𝐜 = 𝐲
𝐔𝐳 = 𝐜
𝐱 = 𝐐𝐳
For example, Section 8.6 describes algorithms that create numeric factorizations
satisfying the first two of these assumptions. Section 8.4.1 describes an
algorithm for obtaining the row and column permutations corresponding to a
minimum degree pivot strategy.

For simplicity of exposition, it is assumed that the vectors 𝐛, 𝐜, 𝐱, 𝐲, and 𝐳
are stored in linear arrays. However, any data structure that lets you retrieve
and update an element of a vector based on its index will suffice.
8.8 Sparse LU Factor Update
Sparse factor update considers the modified matrix

𝐀̃ = 𝐀 + 𝛼𝐲𝐳𝐓 (94)

where 𝛼 is a scalar, 𝐲 and 𝐳 are 𝑛 vectors, and 𝐀̃ has the same sparsity pattern
as 𝐀.

The condition on the structure of 𝐀̃ is not imposed by the factor update
process, but is instead a comment on the utility of factor update in a sparse
environment. If the modification to 𝐀 introduces new elements into the matrix,
the pivot sequence determined during symbolic factorization may no longer
apply. The sparsity degradation introduced by an inappropriate pivot sequence
may outweigh the benefits gained from updating the existing factorization.

The performance of factor update algorithms is often enhanced by restricting
pivot operations to the portions of 𝐋 and 𝐔 that are directly affected by the
change in 𝐀. Papers by Tinney, Brandwajn, and Chan (1985) [8] and Chan and
Brandwajn (1986) [12] describe a systematic methodology for determining this
subset of 𝐋𝐔. The rows of 𝐔 and columns of 𝐋 that are changed during factor
update are referred to as the factorization path. The fundamental operation
is to determine the factorization path associated with a vector 𝐲 with just one
nonzero element. Such a vector is called a singleton. Its factorization path is
called a singleton path. If more than one element of 𝐲 is nonzero, the
composite factorization path is simply the union of the singleton paths.
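A C sketch of the singleton path computation follows; first_offdiag is a
hypothetical structural query returning the column of the first off-diagonal
nonzero in a row of 𝐔, or −1 when the row has none.

int first_offdiag(int i);        /* hypothetical structural query on U */

/* Trace the singleton factorization path starting at row i: each step
   moves to the column of the first off-diagonal nonzero in the current
   row of U. The rows visited are the ones a singleton change touches.
   Returns the path length; path must hold at least n entries. */
int singleton_path(int i, int *path)
{
    int len = 0;
    while (i >= 0) {
        path[len++] = i;         /* row i is on the path */
        i = first_offdiag(i);    /* next vertex, or -1 at the end */
    }
    return len;
}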
9 Implementation Notes
This document concludes with a brief discussion of an experimental
implementation of sparse matrix algorithms in a highly cached database
environment.
[Figure 5: Sparse Matrix Representation, showing linked rows 1 through 𝑛 of an
asymmetric matrix and of a symmetric matrix.]
• The average number of splits for the 𝑛+1𝑠𝑡 insertion into a 10,000 key 31-
  61 tree is approximately 0.02933, i.e. on average the tree splits once for
  every 33.40 items inserted. Splitting increases the constant associated with
  the growth rate slightly. It does not increase the growth rate per se.

• The cache was large enough to hold the entire B-link tree of relations A,
  B, and C. There are no cache faults to disk in these measurements. The
  relation D was too big to fit in core. Its next times reflect numerous
  cache faults.

• The get operation looked for the same item during each repetition. This
  explains the lack of cache faults while relation D was processed. Once
  the path to the item was in core it was never paged out.
Table 1: Database Cache Benchmarks

                          Repetitions
              30k        50k        100k       200k      Average
Operation   (seconds)  (seconds)  (seconds)  (seconds)   (𝜇sec)
Relation A¹
  Next         n/a         12         25         51         255
  Get          n/a         26         52        103         515
Relation B²
  Next           7         12         24         47         235
  Get           34         57        114        228       1,140
Relation C³
  Next           7         12         24         49         245
  Get           65        108        216        433       2,165
Relation D⁴
  Next          48         82        164        n/a       1,640
  Cache faults 2,058    3,541      7,095
  Get           76        126        252        n/a       2,520
  Cache faults     1        1          1

¹ 112 tuples, 2 byte key.
² 463 tuples, 4 byte key.
³ 673 tuples, 22 byte key.
⁴ 10,122 tuples, 22 byte key.
¹ sum += a[cursor[i]] * y[cursor[j]]
² a[cursor[i]] *= scalar
¹ Bessel function of the first kind, order 0.
² Extrapolated from 30,000 repetitions.
³ Bessel function of the second kind, order 0.
Differences in loop overheads found in Table 2 and Table 3 are accounted for
by the differences in the loop counter implementation described above. The 3
𝜇sec overhead reflects the time required to increment a long integer and monitor
the termination condition (which also involved a long integer comparison). The
1.3 𝜇sec overhead reflects the time required to increment a register and monitor
the termination condition.
References
[1] L. Fox, An Introduction to Numerical Linear Algebra, Clarendon Press,
Oxford, 1964.
[2] G. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins
University Press, Baltimore, 1983.
[3] I. Duff, A. Erisman, and J. Reid, Direct Methods for Sparse Matrices,
Clarendon Press, Oxford, 1986.
[4] W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes
in C, Cambridge University Press, Cambridge and New York, 1988.
[5] S. Conte and C. de Boor, Elementary Numerical Analysis, McGraw-Hill Book
Company, New York, 1972.
[6] W. Tinney and J. Walker, "Direct solutions of sparse network equations
by optimally ordered triangular factorization", Proceedings of the IEEE,
Volume 55, No. 11, pp. 1801-1809, 1967.
[7] A. George and J. Liu, Computer Solutions of Large Sparse Positive Definite
Systems, Prentice-Hall, Englewood Cliffs, New Jersey, 1981.
[8] W. Tinney, V. Brandwajn, and S. Chan, "Sparse vector methods", IEEE
Transactions on Power Apparatus and Systems, PAS-104, No. 2, 1985.
[9] J. Bennett, "Triangular factors of modified matrices", Numerische
Mathematik, Volume 7, pp. 217-221, 1965.
[10] P. Gill, G. Golub, W. Murray, and M. Saunders, "Methods for modifying
matrix factorizations", Mathematics of Computation, Volume 28, pp. 505-535,
1974.
[11] W. Hager, "Updating the inverse of a matrix", SIAM Review, Volume 31,
No. 2, pp. 221-239, 1989.
[12] S. Chan and V. Brandwajn, "Partial matrix refactorization", IEEE
Transactions on Power Systems, Volume 1, No. 1, pp. 193-200, 1986.
[13] W. Tinney and C. Hart, "Power flow solution by Newton's method", IEEE
Transactions on Power Apparatus and Systems, PAS-86, No. 11, 1967.