SOLVING ELLIPTIC FINITE ELEMENT SYSTEMS IN NEAR-LINEAR TIME WITH SUPPORT PRECONDITIONERS

ERIK G. BOMAN, BRUCE HENDRICKSON, AND STEPHEN VAVASIS*
Abstract. We show in this note how support preconditioners can be applied to a class of linear systems arising from use of the finite element method to solve linear elliptic problems. Our technique reduces the problem, which is symmetric and positive definite, to a symmetric positive definite diagonally dominant problem. Significant theory has already been developed for preconditioners in the diagonally dominant case. We show that the degradation in the quality of the preconditioner using our technique is only a small constant factor.
1. Introduction. Finite element discretizations of elliptic PDEs give rise to large sparse linear systems of equations. A topic of great interest is preconditioners for iterative solution of such systems. Our contribution in this paper has two parts: First, we show how SPD matrices of a particular form $K = A^T D^{1/2} J^T J D^{1/2} A$ can be well approximated by a symmetric, diagonally dominant M-matrix. Since significant theory has been developed for such matrices, we know they can be solved efficiently using preconditioned iterative methods. Second, we show that the stiffness matrix $K$ from a large class of finite elements for elliptic problems has the structure mentioned above.
The idea of approximating FEM systems by diagonally dominant matrices is not new; see for example [7]. In fact, our approach is similar to Gustafsson's in that we approximate each element matrix by a diagonally dominant matrix. In contrast to [7], we are able to rigorously prove bounds on the spectral properties of our approximation (and thus also the preconditioner). Our analysis uses the support theory framework described in [3] for analyzing condition numbers and (generalized) eigenvalues for preconditioned systems.
The support theory framework is briefly reviewed in Section 2, and then a new general-purpose result about support numbers is provided. In order to apply this result to finite element problems, a factorization of the finite element stiffness matrix $K$ in the form $K = A^T \bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2} A$ is obtained, which is the topic of Sections 3-5. This factorization is found in two steps. First, we factor $K = A^T J^T D J A$ in a fairly natural fashion, i.e., in a fashion that has a direct derivation from finite element principles. Then we refactor $J^T D J$ as $\bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2}$ in order to apply our new result. Analysis of the relevant condition numbers is provided in Section 6. Since our notation is somewhat formidable, we provide an explanatory example of how the method works for the commonly arising special case of piecewise linear finite elements in two dimensions in Section 7.
2. Support Theory. For $n \times n$ symmetric positive semidefinite (SPSD) matrices $A$, $B$ we define the support number to be
$$\sigma(A, B) = \min \left\{ t \in \mathbb{R} \;\middle|\; x^T(\tau B - A)x \ge 0 \text{ for all } x \in \mathbb{R}^n \text{ and for all } \tau \ge t \right\}.$$
For historical reasons, the letter $\sigma$ is used for support numbers. Unfortunately, $\sigma$ is also commonly used to denote singular values. In this paper a $\sigma$ with one argument is a singular value, and a $\sigma$ with two arguments is a support number.
Department of Computer Science, 4130 Upson Hall, Cornell University, Ithaca, NY 14853, USA,
vavasis@cs.cornell.edu.
When $A$ and $B$ are symmetric positive definite (SPD) then $\sigma(A, B) = \lambda_{\max}(A, B)$, the largest generalized eigenvalue. When $B$ is a preconditioner for $A$, then the condition number of the preconditioned system is given by $\sigma(A, B)\,\sigma(B, A)$. Hence, by bounding the support numbers $\sigma(A, B)$ and $\sigma(B, A)$ we can bound the condition number. A number of algebraic techniques for bounding support numbers were given in [3]. For symmetric, diagonally dominant matrices, graph embedding techniques can also be used [6, 2, 12]. In the latter paper [12] it was shown that all symmetric, diagonally dominant systems can be solved in near-linear time. Our aim in this paper is to extend this result to FEM systems that are not diagonally dominant.
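As an aside for intuition (not part of the original development), support numbers of small dense SPD pairs can be evaluated directly through the generalized eigenvalue characterization above. The following Python sketch is our own illustration and assumes SciPy with dense matrices:

```python
import numpy as np
from scipy.linalg import eigh

def support_number(A, B):
    """sigma(A, B) for SPD A, B: the largest generalized eigenvalue of (A, B)."""
    return eigh(A, B, eigvals_only=True)[-1]   # eigenvalues returned in ascending order

# The condition number of the preconditioned system is sigma(A, B) * sigma(B, A).
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5))
A = X @ X.T + 5 * np.eye(5)    # SPD test matrix
B = A + 0.3 * np.eye(5)        # SPD "preconditioner"
print(support_number(A, B) * support_number(B, A))
```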
We will need some results from [3]. One is the triangle inequality,
$$\sigma(K, M) \le \sigma(K, \tilde{K})\, \sigma(\tilde{K}, M), \qquad (2.1)$$
which shows that the overall support number is bounded by the product of the support numbers in each step of the approximation $K \to \tilde{K} \to M$.
A second result is the Symmetric-Product Support Theorem:
Theorem 2.1. Suppose $U \in \mathbb{R}^{n \times k}$ is in the range of $V \in \mathbb{R}^{n \times p}$. Then
$$\sigma(UU^T, VV^T) = \min_W \|W\|_2^2 \quad \text{subject to } VW = U.$$
We derive a couple of corollaries from this theorem that will be useful in our analysis. We are interested in the case where $U = VG$ for some $G$.
Corollary 2.2. Let $V$ be as above and suppose $G$ is a matrix with $p$ rows. Then
$$\sigma(VGG^TV^T, VV^T) \le \sigma_{\max}(G)^2,$$
where $\sigma_{\max}$ denotes the largest singular value.
Proof. Let $U = VG$ and apply Theorem 2.1. This gives
$$\sigma(VGG^TV^T, VV^T) \le \|G\|_2^2 = \sigma_{\max}(G)^2.$$
Corollary 2.3. Let $V$ be as above and suppose $G$ is a rank-$p$ matrix with $p$ rows. Then
$$\sigma(VV^T, VGG^TV^T) \le \sigma_{\min}(G)^{-2},$$
where $\sigma_{\min}$ denotes the smallest singular value.
Proof. Let $\bar{G} = G^{+}$, the pseudoinverse of $G$; since $G$ has full row rank, $G\bar{G} = I$. Let $\bar{V} = VG$ and thus $\bar{V}\bar{G} = V$. Then
$$\sigma(VV^T, VGG^TV^T) = \sigma(\bar{V}\bar{G}\bar{G}^T\bar{V}^T, \bar{V}\bar{V}^T)$$
and the result follows from Corollary 2.2 and the fact that the singular values of $\bar{G}$ are the reciprocals of those of $G$, so that $\sigma_{\max}(\bar{G}) = 1/\sigma_{\min}(G)$.
The way we will apply this theorem is to let $V$ be a weighted vertex-edge incidence matrix. Then $VV^T$ is diagonally dominant while $VGG^TV^T$ is not (in general). Therefore, we now have a tool to approximate non-diagonally-dominant matrices.
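To make Corollaries 2.2 and 2.3 concrete, the following sketch (our own illustration, reusing the support_number helper from the previous sketch; $V$ is taken square and nonsingular so that both matrix pencils are definite) checks the two bounds numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
V = rng.standard_normal((n, n)) + 3 * np.eye(n)   # full rank, so V V^T is SPD
G = rng.standard_normal((n, n))                   # full rank with probability 1

B1 = V @ V.T                 # the diagonally dominant side in our application
A1 = V @ G @ G.T @ V.T       # the side that is not diagonally dominant
s = np.linalg.svd(G, compute_uv=False)
assert support_number(A1, B1) <= s[0] ** 2 + 1e-8       # Corollary 2.2
assert support_number(B1, A1) <= s[-1] ** (-2) + 1e-8   # Corollary 2.3
```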
3. The matrix approximation. Our main matrix factorization result is summarized by the following theorem, whose various aspects and whose proof are explained in upcoming sections.
Theorem 3.1. Suppose $K$ is the $n \times n$ assembled stiffness matrix of the standard order-$p$ isoparametric finite element method applied to the elliptic boundary value problem given by (4.1) below. (The entries of this matrix are given by (4.7).) Let $d$ denote the space dimension, $m$ the number of elements, $n$ the number of unconstrained nodes (i.e., nodes not on Dirichlet boundaries), and $q$ the number of Gauss points in the finite element quadrature scheme. Assume the quadrature scheme has positive weights and is exact for polynomials of degree $2p - 2$. Let $l$ denote the number of nodes per element. (For triangles in two dimensions, we have $l = (p + 1)(p + 2)/2$. For tetrahedra in three dimensions, we have $l = (p + 1)(p + 2)(p + 3)/6$.) Then $K$ may be factored as $A^T J^T D J A$, where $A$ is an $(l - 1)m \times n$ reduced node-arc incidence matrix, $J$ is a $dqm \times (l - 1)m$ matrix that is well conditioned (see (6.6) and (6.7) for the bounds) and $D$ is a $dqm \times dqm$ positive definite diagonal matrix. Further, $J^T D J$ may be refactored as $\bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2}$, where $\bar{D}$ is an $(l - 1)m \times (l - 1)m$ positive definite diagonal matrix and $\bar{J}$ is a $dqm \times (l - 1)m$ matrix that is also well conditioned (see (6.9) and (6.10)).
This theorem can now be combined with Corollaries 2.2 and 2.3 to obtain a good approximation $\tilde{K}$ to the stiffness matrix $K$. In particular, we take $\tilde{K} = A^T \bar{D} A$. Then in the context of those corollaries, $V = A^T \bar{D}^{1/2}$ and $G = \bar{J}^T$. The corollaries thus imply that the condition number of $\tilde{K}$ with respect to $K$ depends only on the condition number of $\bar{J}$, for which we have good bounds. Then $\tilde{K}$ can be preconditioned using techniques in the previous literature. The complete description is given in Section 8 below.
4. Finite element method. In this section, we explain the boundary value problem and the finite element method used to solve it as introduced by Theorem 3.1. The class of problems under consideration consists of finite-element discretizations of the following second-order elliptic boundary value problem. Find $u : \Omega \to \mathbb{R}$ satisfying
$$-\nabla \cdot (\theta \nabla u) = f \text{ on } \Omega, \qquad u = u_0 \text{ on } \Gamma_1, \qquad \partial u/\partial n = g \text{ on } \Gamma_2. \qquad (4.1)$$
Here, $\Omega$ is a bounded open subset of $\mathbb{R}^d$ (typically $d = 2$ or $d = 3$), $\Gamma_1$ and $\Gamma_2$ form a partition of $\partial\Omega$, $\theta$ is a given scalar field on $\Omega$ that is positive-valued everywhere and is sometimes called the conductivity, $f : \Omega \to \mathbb{R}$ is a given function called the forcing function, $u_0$ is a given function called the Dirichlet boundary condition, and $g$ is another given function called the Neumann boundary condition.
For the rest of this section, we describe the isoparametric finite element method for solving (4.1). This material is quite standard and is covered by many monographs, e.g., [9]. Our reason for presenting it is to define our notation used in the next few sections.
We assume that $\Omega$ is discretized using a mesh $\mathcal{T}$ of $\Omega$, which is a finite set of elements $T \in \mathcal{T}$ that meet each other simplicially. Each $T \in \mathcal{T}$ is the image of a mapping function, that is, an orientation-preserving diffeomorphism $\Phi_T : T_0 \to T$, where $T_0$ is called the reference element. An assumption made for simplicity is that the reference element $T_0$ is the standard simplex, e.g., the triangle with vertices $(0, 0)$, $(1, 0)$, $(0, 1)$ when $d = 2$ or the tetrahedron with vertices $(0, 0, 0)$, $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$ when $d = 3$. Other common choices for reference elements would be squares in 2D or cubes in 3D. We assume further that each $\Phi_T$ is a $d$-variate polynomial of degree $p$ or less.
The $\Phi_T$'s are chosen so that if two elements $T, T' \in \mathcal{T}$ share a common face $F$, say $\Phi_T(F_0) = F$ and $\Phi_{T'}(F_0') = F$ for faces $F_0, F_0'$ of $T_0$, then there is an affine linear automorphism $\pi$ of $T_0$ such that $\pi(F_0) = F_0'$ and such that $\Phi_{T'} \circ \pi|_{F_0} = \Phi_T|_{F_0}$. Here, $\pi|_{F_0}$ denotes the restriction of $\pi$ to $F_0$. Usually, $\Phi_T$ is chosen so that the element $T$ conforms well to the boundaries and satisfies certain regularity properties; see, e.g., [11]. Let the union of all $T \in \mathcal{T}$ be denoted $\tilde{\Omega}$. This set $\tilde{\Omega}$ is intended to be a close approximation to $\Omega$; exact agreement is not possible unless the boundary of $\Omega$ can be expressed as the union of images of polynomial maps of degree $p$ or less.
Let $V_{p,\mathcal{T}}$ denote the set of functions $\phi : \tilde{\Omega} \to \mathbb{R}$ such that (a) $\phi$ is continuous; (b) $\phi|_T$ has the form $s \circ \Phi_T^{-1}$, where $s$ is a scalar polynomial function of degree $p$ or less on the reference element $T_0$ and where $T$ is any member of $\mathcal{T}$; and (c) $\phi$ is identically zero on $\tilde{\Gamma}_1$. Here, $\tilde{\Gamma}_1$ is the approximation to $\Gamma_1$ that is an appropriately chosen subset of $\partial\tilde{\Omega}$.
Observe that $V_{p,\mathcal{T}}$ is a finite-dimensional vector space. Let $\hat{V}_{p,\mathcal{T}}$ be the same as $V_{p,\mathcal{T}}$ except that in place of condition (c), we require (c$'$) $\phi|_{\tilde{\Gamma}_1} = \tilde{u}_0$, where $\tilde{u}_0$ is an approximation to the Dirichlet boundary condition $u_0$ on $\Gamma_1$. The exact definition of $\tilde{u}_0$ is immaterial for the results of this paper. Observe that $\hat{V}_{p,\mathcal{T}}$ is a coset of $V_{p,\mathcal{T}}$.
The isoparametric finite element method finds an approximate solution $u$ to (4.1) in the space $\hat{V}_{p,\mathcal{T}}$. For this purpose, a basis of $V_{p,\mathcal{T}}$ must be identified. The standard approach to defining a basis is as follows. In the reference element, position an evenly-spaced array of $l = (p + 1)(p + 2)/2$ nodes when $d = 2$ or of $l = (p + 1)(p + 2)(p + 3)/6$ nodes when $d = 3$. These nodes have coordinates of the form $(i/p, j/p)$ (in two dimensions) or $(i/p, j/p, k/p)$ (in three dimensions) where $i, j$ (and $k$ in 3D) are nonnegative integers whose sum is at most $p$. Let these nodes be enumerated $z_1, \ldots, z_l$.
For each $T \in \mathcal{T}$, let $v_{T,1}, \ldots, v_{T,l}$ denote $\Phi_T(z_1), \ldots, \Phi_T(z_l)$. These $l$ points are the nodes of $T$ listed with their local numbering. (A word about notation: we use ordinary italic type for scalars and for vectors of length $d$. We reserve bold type for long vectors, e.g., vectors of length $n$.) Because of the assumption of how neighboring $\Phi_T$'s are related, nodes that lie on a simplicial subface of $T$ exactly coincide with the corresponding subset of nodes of any other element $T'$ sharing that subface. Let the distinct global nodes be enumerated $w_1, \ldots, w_{\bar{n}}$, ordered so that $w_1, \ldots, w_n$ are unconstrained while $w_{n+1}, \ldots, w_{\bar{n}}$ are contained in $\tilde{\Gamma}_1$. There is an index mapping function, the local-to-global numbering map, carrying a pair $(T, i)$, where $T \in \mathcal{T}$ and $i \in \{1, \ldots, l\}$, to the index $j \in \{1, \ldots, \bar{n}\}$ such that $v_{T,i} = w_j$; we write $LG(T, i) = j$. Let $N_1, \ldots, N_l$ denote the shape functions of the reference element, i.e., the scalar polynomials of degree $p$ or less on $T_0$ satisfying $N_I(z_{I'}) = \delta_{II'}$. The basis functions $\phi_1, \ldots, \phi_{\bar{n}}$ are then defined by $\phi_j|_T = N_I \circ \Phi_T^{-1}$ whenever $LG(T, I) = j$; they satisfy the delta property $\phi_i(w_j) = \delta_{ij}$ for $i = 1, \ldots, \bar{n}$ and $j = 1, \ldots, \bar{n}$, and $\phi_1, \ldots, \phi_n$ form a basis of $V_{p,\mathcal{T}}$. The weak form of (4.1) is to find $u$ with $u = u_0$ on $\Gamma_1$ such that for all test functions $t$ vanishing on $\Gamma_1$,
$$\int_\Omega \nabla t(x) \cdot \theta(x) \nabla u(x)\, dx = \int_\Omega t(x) f(x)\, dx + \int_{\Gamma_2} t(x) g(x)\, dx.$$
The finite element method discretizes this weak form in the sense that it computes a $u \in \hat{V}_{p,\mathcal{T}}$ such that for all $t \in V_{p,\mathcal{T}}$,
$$\int_{\tilde{\Omega}} \nabla t(x) \cdot \theta(x) \nabla u(x)\, dx = \int_{\tilde{\Omega}} t(x) \tilde{f}(x)\, dx + \int_{\tilde{\Gamma}_2} t(x) \tilde{g}(x)\, dx,$$
where $\tilde{f}, \tilde{g}$ are approximations to $f$ and $g$. Since the weak form is linear in $t$, it suffices for the weak form to hold only for the basis of $V_{p,\mathcal{T}}$. Thus, we seek $u \in \hat{V}_{p,\mathcal{T}}$ such that for $i = 1, \ldots, n$:
$$\int_{\tilde{\Omega}} \nabla \phi_i(x) \cdot \theta(x) \nabla u(x)\, dx = \int_{\tilde{\Omega}} \phi_i(x) \tilde{f}(x)\, dx + \int_{\tilde{\Gamma}_2} \phi_i(x) \tilde{g}(x)\, dx. \qquad (4.2)$$
The next observation is that since $u \in \hat{V}_{p,\mathcal{T}}$, it may be written in the form
$$u = u_{BC} + \sum_{j=1}^n \chi_j \phi_j, \qquad (4.3)$$
where $u_{BC}$ is a function in $\hat{V}_{p,\mathcal{T}}$ that is equal to $\tilde{u}_0$ on $\tilde{\Gamma}_1$, is zero at nodes $w_1, \ldots, w_n$ (i.e., all nodes not on $\tilde{\Gamma}_1$), and is interpolated in the standard way in between nodes, and $\chi_1, \ldots, \chi_n$ are initially unknown scalars. Because of the delta-property of the $\phi_i$'s, we have $u(w_i) = \chi_i$ for all $i = 1, \ldots, n$. Substituting (4.3) into (4.2) and rearranging reduces the computation of $u$ to the problem of solving
$$K\boldsymbol{\chi} = \mathbf{f},$$
where $\boldsymbol{\chi}$ is the vector $(\chi_1, \ldots, \chi_n)^T$, $\mathbf{f}$ is called the load vector and arises from contributions involving $\tilde{f}$, $\tilde{g}$ and $u_{BC}$, and $K$ is the $n \times n$ matrix whose entries are defined by
$$K(i, j) = \int_{\tilde{\Omega}} \nabla \phi_i(x) \cdot \theta(x) \nabla \phi_j(x)\, dx. \qquad (4.4)$$
This matrix $K$ is called the assembled stiffness matrix of the problem. It is sparse, symmetric and positive definite. Symmetry is obvious; positive definiteness follows from a fairly straightforward argument that we omit, and sparsity follows because $K(i, j)$ is nonzero only if there is an element $T$ that contains both global nodes $w_i$ and $w_j$. (The proof of positive definiteness assumes that $\tilde{\Gamma}_1$ contains at least one node per connected component of $\tilde{\Omega}$. We make this assumption in our setting, although it is not required for our method.)
Integral (4.4) is difficult to compute directly because evaluating $\phi_i$ requires evaluation of $\Phi_T^{-1}$, which is a nontrivial matter. Fortunately, this difficulty is avoided by carrying out the integral over the reference domain following a change of variables as follows:
$$K(i, j) = \sum_{T \in \mathcal{T}} \int_T \nabla \phi_i(x) \cdot \theta(x)\, \nabla \phi_j(x)\, dx = \sum_{T \in \mathcal{T}} \int_{T_0} \nabla_x \phi_i(\Phi_T(z)) \cdot \theta(\Phi_T(z))\, \nabla_x \phi_j(\Phi_T(z)) \det(D\Phi_T(z))\, dz \qquad (4.5)$$
where the notation $\nabla_x$ means derivative with respect to the coordinates of $T$ (as opposed to derivative with respect to $z$, the coordinates of $T_0$). Here, $D$ denotes differentiation, hence $D\Phi_T(z)$ is a $d \times d$ matrix. The integrand of (4.5) is evaluated using the chain rule for derivatives. Assume $w_i$ is a node of $T$ (else the above integral is 0). Let $I$ be the index such that $LG(T, I) = i$, so that $\phi_i|_T = N_I \circ \Phi_T^{-1}$. Then
$$\nabla_x \phi_i(\Phi_T(z)) = D\Phi_T(z)^{-T}\, \nabla_z N_I(z), \qquad (4.6)$$
and similarly for $\phi_j$. Here, $D\Phi_T(z)^{-T}$ denotes the transposed inverse of a $d \times d$ matrix (which exists since $\Phi_T$ is assumed to be a diffeomorphism). The product in the previous formula is a matrix-vector multiplication.
We assume that the entries of $K$ are not computed exactly but are obtained by a quadrature rule that we now discuss. Let $r_1, \ldots, r_q$ be points in the interior of the reference element $T_0$ called the Gauss points. (As is common practice, we use this terminology even if the quadrature rule is not related to Gaussian quadrature.) Let $\omega_1, \ldots, \omega_q$ be corresponding Gauss weights. We denote this quadrature rule (i.e., the set of ordered pairs $(r_1, \omega_1), \ldots, (r_q, \omega_q)$) by the symbol $Q$. Then in place of (4.5) we take
$$K(i, j) = \sum_{T \in \mathcal{T}} \sum_{\ell=1}^q \nabla_x \phi_i(\Phi_T(r_\ell)) \cdot \theta(\Phi_T(r_\ell))\, \nabla_x \phi_j(\Phi_T(r_\ell)) \det(D\Phi_T(r_\ell))\, \omega_\ell, \qquad (4.7)$$
in which $\nabla_x \phi_i(\Phi_T(r_\ell))$ is evaluated by substituting $z = r_\ell$ into (4.6), and similarly for $\nabla_x \phi_j(\Phi_T(r_\ell))$.
5. Reformulation of the method. In this section we present a reformulation of the finite element method that leads to the factorization of $K$ on which the preconditioner is based. This factorization is related to one proposed in [14]. Let $\boldsymbol{\chi}$ be an arbitrary vector in $\mathbb{R}^n$, and let $u \in V_{p,\mathcal{T}}$ be defined by
$$u = \sum_{i=1}^n \chi_i \phi_i.$$
For an element $T$, let
$$U_T = \sum_{I=1}^l \chi_{LG(T,I)} N_I. \qquad (5.1)$$
With these definitions, it is clear that $u|_T \circ \Phi_T = U_T$.
The assembled stiffness matrix is defined by (4.7) so that
$$\begin{aligned}
\boldsymbol{\chi}^T K \boldsymbol{\chi}
&= \sum_{i=1}^n \sum_{j=1}^n \chi_i \chi_j K(i, j) \\
&= \sum_{i=1}^n \sum_{j=1}^n \sum_{T \in \mathcal{T}} \sum_{\ell=1}^q \chi_i \nabla_x \phi_i(\Phi_T(r_\ell)) \cdot \theta(\Phi_T(r_\ell))\, \chi_j \nabla_x \phi_j(\Phi_T(r_\ell)) \det(D\Phi_T(r_\ell))\, \omega_\ell \\
&= \sum_{T \in \mathcal{T}} \sum_{\ell=1}^q \left( \sum_{i=1}^n \chi_i \nabla_x \phi_i(\Phi_T(r_\ell)) \right) \cdot \theta(\Phi_T(r_\ell)) \left( \sum_{j=1}^n \chi_j \nabla_x \phi_j(\Phi_T(r_\ell)) \right) \det(D\Phi_T(r_\ell))\, \omega_\ell \\
&= \sum_{T \in \mathcal{T}} \sum_{\ell=1}^q \nabla_x u(\Phi_T(r_\ell)) \cdot \theta(\Phi_T(r_\ell))\, \nabla_x u(\Phi_T(r_\ell)) \det(D\Phi_T(r_\ell))\, \omega_\ell \\
&= \sum_{T \in \mathcal{T}} \sum_{\ell=1}^q \left( D\Phi_T(r_\ell)^{-T} \nabla_z U_T(r_\ell) \right) \cdot \theta(\Phi_T(r_\ell)) \left( D\Phi_T(r_\ell)^{-T} \nabla_z U_T(r_\ell) \right) \det(D\Phi_T(r_\ell))\, \omega_\ell \qquad (5.2) \\
&= v^T D v. \qquad (5.3)
\end{aligned}$$
In (5.3), we introduced a $dqm \times dqm$ diagonal matrix $D$ and a $dqm$-vector $v$ with entries defined as follows. First, $m = |\mathcal{T}|$, the number of elements. The rows and columns of $D$ as well as the entries of $v$ are indexed by ordered triples $(T, \ell, e)$ where $T \in \mathcal{T}$, $\ell \in \{1, \ldots, q\}$, and $e \in \{1, \ldots, d\}$. The $((T, \ell, e), (T, \ell, e))$ diagonal entry of $D$ is taken to be $\theta(\Phi_T(r_\ell)) \det(D\Phi_T(r_\ell))\, \omega_\ell\, \tau_T^2$ (independent of $e$). Here $\tau_T$ is a positive scalar defined by (6.4) below. Because we want $D$ to be positive definite, we must impose the assumption mentioned in Theorem 3.1 that for all $\ell = 1, \ldots, q$, $\omega_\ell > 0$. There is some loss of generality with this assumption because a few popular finite element quadrature schemes (but certainly not all) use negative weights [5].
Entries of $v$ are defined as follows: the $(T, \ell)$ block-entry of $v$, which is a $d$-vector, is $D\Phi_T(r_\ell)^{-T} \nabla_z U_T(r_\ell)/\tau_T$. With these definitions of $D$ and $v$, it is clear that $v^T D v$ is equal to the expression in (5.2).
Next, we write $v = RSA\boldsymbol{\chi}$, where $R$, $S$ and $A$ are matrices defined as follows. First, $A$ is an $(l - 1)m \times n$ matrix whose entries are all 0's, 1's and $-1$'s chosen according to the following pattern. For each element $T \in \mathcal{T}$, $A$ has a block of $l - 1$ consecutive rows; let the rows of $A$ be indexed by $(T, j)$ for $j = 2, \ldots, l$. The columns are indexed $1, \ldots, n$ in correspondence with global nodes $w_1, \ldots, w_n$. Row $(T, j)$ has a 1 in the column indexed by $LG(T, j)$ and a $-1$ in the column indexed by $LG(T, 1)$. Thus, most rows of $A$ have exactly two nonzero entries. In the case that $LG(T, j) > n$ (i.e., node $v_{T,j}$ lies in $\tilde{\Gamma}_1$), the 1 entry is omitted. Similarly, if $LG(T, 1) > n$, then the $-1$ entry is omitted. For this reason, a few rows of $A$ have just one nonzero entry or none at all.
Thus, $A$ is a reduced node-arc incidence matrix of a graph defined on the nodes of $\mathcal{T}$. Each element gives rise to $l - 1$ arcs in the graph. In particular, for each element $T \in \mathcal{T}$, there is an arc joining each of its nodes $2, \ldots, l$ to node 1. Note that the nodes of $\tilde{\Gamma}_1$ are all collapsed into a single supernode, and then the column corresponding to this supernode is deleted. (In the case that $\tilde{\Omega}$ has multiple connected components, the $\tilde{\Gamma}_1$ nodes of each component are distinct supernodes.)
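As an illustration of this construction (our own sketch, not from the paper's development), the following Python code assembles the reduced node-arc incidence matrix from a hypothetical 0-based local-to-global table LG stored as an $(m, l)$ integer array, with global indices $\ge n$ playing the role of Dirichlet nodes:

```python
import numpy as np
from scipy.sparse import coo_matrix

def incidence_matrix(LG, n):
    """Reduced node-arc incidence matrix A.

    LG : (m, l) integer array; LG[T, i] is the 0-based global index of local
         node i of element T.  Indices >= n correspond to Dirichlet nodes.
    n  : number of unconstrained nodes.
    """
    m, l = LG.shape
    rows, cols, vals = [], [], []
    for T in range(m):
        for j in range(1, l):            # one arc from local node j to local node 1
            r = T * (l - 1) + (j - 1)
            if LG[T, j] < n:             # the +1 entry, omitted for Dirichlet nodes
                rows.append(r); cols.append(LG[T, j]); vals.append(1.0)
            if LG[T, 0] < n:             # the -1 entry, omitted for Dirichlet nodes
                rows.append(r); cols.append(LG[T, 0]); vals.append(-1.0)
    return coo_matrix((vals, (rows, cols)), shape=((l - 1) * m, n)).tocsr()
```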
The product $A\boldsymbol{\chi}$ yields a vector with $(l - 1)m$ entries. It is composed of $m$ blocks of $l - 1$ entries. The block corresponding to $T \in \mathcal{T}$, which we denote by $\delta_T$, is an $(l - 1)$-vector of finite differences of entries of $\boldsymbol{\chi}$ (i.e., entries of the form $\chi_i - \chi_j$) that are associated with the $l$ nodes of $T$.
Next, let $S_{Q,p}$ denote the $dq \times (l - 1)$ matrix that carries out the following operation. Given an $(l - 1)$-vector $\delta_T$ that represents finite differences of nodal values of $u$ assigned to the nodes of $T_0$, we let $S_{Q,p}$ be the matrix such that $S_{Q,p} \delta_T$ is the concatenation of $\nabla_z U_T(r_1), \ldots, \nabla_z U_T(r_q)$, where $U_T$ was defined by (5.1). (Because we have only finite differences of the $\chi_i$'s, $U_T$ cannot be recovered exactly from $\delta_T$; rather, $U_T$ can be recovered only up to a constant additive term. This does not create a problem since $S_{Q,p}$ is defined to produce the gradient of $U_T$ only, which does not require knowledge of this missing term.) This matrix $S_{Q,p}$ depends on $p$ and $Q$ but on no other aspect of the problem.
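Concretely, since the Lagrange shape functions reproduce constants, $\sum_{I=1}^l \nabla_z N_I \equiv 0$, so $\nabla_z U_T = \sum_{j \ge 2} (\chi_j - \chi_1) \nabla_z N_j$ depends only on the finite differences, and the columns of $S_{Q,p}$ are the gradients $\nabla_z N_j(r_\ell)$, $j = 2, \ldots, l$, stacked over the Gauss points. The sketch below is our own illustration; grad_N is a hypothetical helper returning the $(l, d)$ array of reference shape-function gradients at a point:

```python
import numpy as np

def build_S(grad_N, gauss_pts):
    """S_{Q,p}: maps the finite differences (chi_j - chi_1), j = 2..l, to the
    concatenation of grad U_T(r_1), ..., grad U_T(r_q).  grad_N(z) is an
    assumed helper returning the (l, d) array of reference gradients at z."""
    l, d = grad_N(gauss_pts[0]).shape
    S = np.zeros((d * len(gauss_pts), l - 1))
    for ell, r in enumerate(gauss_pts):
        S[ell * d:(ell + 1) * d, :] = grad_N(r)[1:, :].T   # columns: j = 2..l
    return S

# For p = 1, d = 2 (the setting of Section 7): grad N_1 = (-1,-1),
# grad N_2 = (1,0), grad N_3 = (0,1), and build_S returns the 2 x 2 identity,
# as claimed there.
grad_N_p1 = lambda z: np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
print(build_S(grad_N_p1, [np.array([1/3, 1/3])]))
```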
We now impose the following additional assumption mentioned in Theorem 3.1: the quadrature rule $Q$ is exact for polynomials of degree $2p - 2$, i.e., if $\psi : T_0 \to \mathbb{R}$ is a polynomial of degree $2p - 2$ or less, then
$$\sum_{\ell=1}^q \psi(r_\ell)\, \omega_\ell = \int_{T_0} \psi(z)\, dz.$$
This assumption is quite reasonable since it is usually required anyway for accurate solution by finite element analysis: one wants accurate quadrature of $\nabla \phi_i \cdot \nabla \phi_j$. With this assumption in hand, we now have the following lemma.
this assumption in hand, we now have the following lemma.
Lemma 5.1. Assuming Q is exact for polynomials of degree 2p 2, matrix S
Q,p
has rank l 1.
Proof. Let
T
be a vector of nite dierences such that S
Q,p
T
= 0. This
means that the polynomial of degree at most p that interpolates the underlying
vector of
i
s at the nodes of T
0
has an identically zero gradient at all the Gauss
points. Consider the polynomial s = . This polynomial has degree at most
2p 2 and is also identically zero at the Gauss points. Since the quadrature rule
is exact for polynomials of this degree,
_
T0
s(z) dz must be zero. But since s is a
nonnegative polynomial, this means s must be identically zero, i.e., is identically
zero, i.e., is a constant function. But in this case,
T
must be all zeros since its
entries consist of nite dierences of nodal values of . Thus, we have shown that the
only solution to S
Q,p
T
= 0 is
T
= 0, which proves the theorem.
Let us define $S$ to be the $dqm \times (l - 1)m$ block-diagonal matrix consisting of $m$ copies of $S_{Q,p}$ on the diagonal. In other words, the $((k-1)dq + 1 : kdq,\ (k-1)(l-1) + 1 : k(l-1))$ submatrix of $S$ is a copy of $S_{Q,p}$ for $k = 1, \ldots, m$, and all other entries of $S$ are zeros. Then $SA\boldsymbol{\chi}$ is a vector whose entries are indexed by $(T, \ell, e)$ for $T \in \mathcal{T}$, $\ell \in \{1, \ldots, q\}$ and $e \in \{1, \ldots, d\}$. The $(T, \ell)$ block of entries of $SA\boldsymbol{\chi}$ is a $d$-vector containing $\nabla_z U_T(r_\ell)$.
Recall that the $(T, \ell)$ block of entries of $v$ contains $D\Phi_T(r_\ell)^{-T} \nabla_z U_T(r_\ell)/\tau_T$. Thus, to have the identity $v = RSA\boldsymbol{\chi}$, we need to define a $dqm \times dqm$ block-diagonal matrix $R$ composed of $qm$ blocks of size $d \times d$ each, where the diagonal block indexed by $(T, \ell)$ is $D\Phi_T(r_\ell)^{-T}/\tau_T$.
Thus, we have derived $R$, $S$, $A$ such that $RSA\boldsymbol{\chi} = v$. Since $v^T D v = \boldsymbol{\chi}^T K \boldsymbol{\chi}$, this means $\boldsymbol{\chi}^T A^T S^T R^T D R S A \boldsymbol{\chi} = \boldsymbol{\chi}^T K \boldsymbol{\chi}$. Since this holds for all $\boldsymbol{\chi} \in \mathbb{R}^n$, and since a symmetric matrix $C$ is uniquely determined by the mapping $\boldsymbol{\chi} \mapsto \boldsymbol{\chi}^T C \boldsymbol{\chi}$, this means that $K = A^T S^T R^T D R S A$.
Let us define $J = RS$. Then $K = A^T J^T D J A$. This factorization underlies the new preconditioning approach. Our factorization is reminiscent of one proposed by Argyris [1] of the form $K = \hat{A} P \hat{A}^T$, which he calls the natural factorization. In Argyris's factorization, however, the matrix $\hat{A}$ has all +1 and 0 entries and therefore is not a node-arc incidence matrix. The purpose of Argyris's matrix $\hat{A}$ is to assemble the element stiffness matrices, which constitute the blocks of the block-diagonal matrix $P$.
Observe that $J$, a $dqm \times (l - 1)m$ matrix, is block diagonal: it has the $dq \times (l - 1)$ matrix
$$J_T = \mathrm{diag}\left( D\Phi_T(r_1)^{-T}, \ldots, D\Phi_T(r_q)^{-T} \right) S_{Q,p} / \tau_T$$
in the columns and rows indexed by $T$, and is zero in all other positions.
6. Analysis of the factorization. In this section, we provide estimates for the norm and condition number of $J$ and of the individual blocks of $D$.
We start by defining some helpful scalars:
$$\tau_T = \max_{z \in T_0} \|D\Phi_T(z)^{-1}\|, \qquad (6.1)$$
$$\eta_T = \max_{z \in T_0} \|D\Phi_T(z)\|, \qquad (6.2)$$
$$\kappa(\mathcal{T}) = \max_{T \in \mathcal{T}} \tau_T \eta_T. \qquad (6.3)$$
Recall that $\tau_T$ is used in the factorization described herein, which means that $\tau_T$ must be computed for each $T$. In the case $p = 1$ (linear elements), finding $\tau_T$ is quite standard. For higher-order elements, there is no simple method to compute this quantity, but an upper bound can be derived using the techniques of [13]. It turns out, however, that an alternative definition
$$\tau_T = \max_{\ell = 1, \ldots, q} \|D\Phi_T(r_\ell)^{-1}\| \qquad (6.4)$$
is also valid for the analysis that follows, and this definition is much simpler to evaluate. The definition of $\eta_T$ can be similarly altered. Finally, it should be noted that $\kappa(\mathcal{T}) \ge 1$ since for any $z$, $\|D\Phi_T(z)^{-1}\| \cdot \|D\Phi_T(z)\| \ge 1$. It will be seen shortly that the condition numbers of both $J$ and $\bar{J}$ depend on $\kappa(\mathcal{T})$. Therefore, we tacitly assume that $\kappa(\mathcal{T})$ is not too large. This is equivalent to requiring that all the elements in the mesh are "well-shaped," i.e., they are not too distorted when compared to the reference element. This assumption does not imply that the elements are of a uniform size: a uniform resizing of an element $T$ does not affect the product $\tau_T \eta_T$. Note also by Hadamard's inequality that
$$\tau_T^{-d} \le \det(D\Phi_T(z)) \le \eta_T^d. \qquad (6.5)$$
Note that $\tau_T^{-d}$ and $\eta_T^d$ differ by at most a factor $\kappa(\mathcal{T})^d$.
We use these scalars to bound the condition numbers of the factors in our factorization. Let
$$\bar{\sigma}_{Q,p} = \sigma_{\max}(S_{Q,p}) \quad \text{and} \quad \underline{\sigma}_{Q,p} = \sigma_{\min}(S_{Q,p}).$$
Recall by Lemma 5.1 that both of these are positive.
Lemma 6.1. Matrix $J$ is well-conditioned in the sense that its singular values are bounded between the following values:
$$\sigma_{\max}(J) \le \bar{\sigma}_{Q,p}, \qquad (6.6)$$
and
$$\sigma_{\min}(J) \ge \underline{\sigma}_{Q,p} / \kappa(\mathcal{T}). \qquad (6.7)$$
Proof. Since $J$ is block-diagonal, its maximum singular value is the maximum singular value among any of its blocks, and similarly for its minimum singular value. Let $R_T$ be the $dq \times dq$ submatrix of $R$ associated with element $T$. Then $J_T = R_T S_{Q,p}$. Since in general $\sigma_{\max}(AB) \le \sigma_{\max}(A)\,\sigma_{\max}(B)$,
$$\begin{aligned}
\sigma_{\max}(J_T) &\le \sigma_{\max}(R_T)\, \sigma_{\max}(S_{Q,p}) \\
&= \sigma_{\max}\left( \mathrm{diag}(D\Phi_T(r_1)^{-T}, \ldots, D\Phi_T(r_q)^{-T}) \right) \bar{\sigma}_{Q,p} / \tau_T \\
&= \max_{\ell = 1, \ldots, q} \sigma_{\max}(D\Phi_T(r_\ell)^{-1})\, \bar{\sigma}_{Q,p} / \tau_T \\
&\le \bar{\sigma}_{Q,p}.
\end{aligned}$$
Since $\sigma_{\min}(AB) \ge \sigma_{\min}(A)\,\sigma_{\min}(B)$ for two matrices $A$, $B$ with full column rank,
$$\begin{aligned}
\sigma_{\min}(J_T) &\ge \sigma_{\min}(R_T)\, \sigma_{\min}(S_{Q,p}) \\
&= \sigma_{\min}\left( \mathrm{diag}(D\Phi_T(r_1)^{-T}, \ldots, D\Phi_T(r_q)^{-T}) \right) \underline{\sigma}_{Q,p} / \tau_T \\
&= \min_{\ell = 1, \ldots, q} \sigma_{\min}(D\Phi_T(r_\ell)^{-1})\, \underline{\sigma}_{Q,p} / \tau_T \\
&= \min_{\ell = 1, \ldots, q} \left( 1 / \sigma_{\max}(D\Phi_T(r_\ell)) \right) \underline{\sigma}_{Q,p} / \tau_T \\
&\ge (1 / \eta_T)\, \underline{\sigma}_{Q,p} / \tau_T \\
&\ge \underline{\sigma}_{Q,p} / \kappa(\mathcal{T}).
\end{aligned}$$
Recall that the entries of $D$ associated with element $T$ consist of the $dq \times dq$ matrix
$$D_T = \tau_T^2\, \mathrm{diag}\left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))\, \omega_1 I, \ldots, \theta(\Phi_T(r_q)) \det(D\Phi_T(r_q))\, \omega_q I \right) \qquad (6.8)$$
where $I$ is the $d \times d$ identity matrix. We now define
$$\rho_\theta = \max_{T} \left( \max_{x \in \mathrm{interior}(T)} \theta(x) \Big/ \min_{x \in \mathrm{interior}(T)} \theta(x) \right).$$
We assume that $\rho_\theta$ is a modest constant. This assumption is realistic because if $\theta$ has steep gradients in a portion of $\tilde{\Omega}$, then small elements are required in that portion of the domain for finite element accuracy. Note that $\theta$ can have discontinuous jumps of arbitrarily large magnitude and yet $\rho_\theta$ will still be modest provided that the large jumps occur along boundaries of mesh elements (so that the max and min within any particular element do not involve the discontinuity). As before, it suffices to define $\rho_\theta$ as the maximum ratio over only those points $x$ from the list $\Phi_T(r_1), \ldots, \Phi_T(r_q)$.
With this assumption, it is not hard to show that $D_T$ is a well conditioned diagonal matrix. Because it is well conditioned, it may be commuted with $J_T$ to obtain the factorization that is ultimately needed for our preconditioner. Let $M_Q$ be defined as $\max_{i=1,\ldots,q} \omega_i$ and $m_Q$ as $\min_{i=1,\ldots,q} \omega_i$. Clearly these constants depend only on the quadrature scheme. We introduce an $(l-1)m \times (l-1)m$ diagonal matrix $\bar{D}$ and write $K = A^T \bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2} A$ where $\bar{J}$ is chosen so that $J^T D J = \bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2}$. To make this identity hold, we take $\bar{J} = D^{1/2} J \bar{D}^{-1/2}$.
Now we explain how to define $\bar{D}$. It is the diagonal matrix with $m$ diagonal blocks, one for each element, such that $\bar{D}_T$ (the submatrix associated with element $T$) is $\theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))\, \tau_T^2 I$, where $r_1$ is the first quadrature point and $I$ denotes the $(l - 1) \times (l - 1)$ identity matrix.
We have the following analog of Lemma 6.1.
Lemma 6.2.
$$\sigma_{\max}(\bar{J}) \le \rho_\theta^{1/2}\, \kappa(\mathcal{T})^{d/2}\, M_Q^{1/2}\, \bar{\sigma}_{Q,p}, \qquad (6.9)$$
and
$$\sigma_{\min}(\bar{J}) \ge \rho_\theta^{-1/2}\, \kappa(\mathcal{T})^{-d/2 - 1}\, m_Q^{1/2}\, \underline{\sigma}_{Q,p}. \qquad (6.10)$$
Proof. Let the diagonal block of $\bar{J}$ associated with $T$ be denoted $\bar{J}_T$. Observe that
$$\bar{J}_T = D_T^{1/2} R_T S_{Q,p} \bar{D}_T^{-1/2} = \left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1)) \right)^{-1/2} \tau_T^{-1} D_T^{1/2} R_T S_{Q,p} = \tilde{D}_T R_T S_{Q,p},$$
where $\tilde{D}_T = \left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1)) \right)^{-1/2} \tau_T^{-1} D_T^{1/2}$. Let us write out a formula for the entries of this diagonal matrix:
$$\begin{aligned}
\tilde{D}_T &= \left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1)) \right)^{-1/2} \tau_T^{-1} D_T^{1/2} \\
&= \left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1)) \right)^{-1/2} \mathrm{diag}\left( \theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))\, \omega_1 I, \ldots, \theta(\Phi_T(r_q)) \det(D\Phi_T(r_q))\, \omega_q I \right)^{1/2} \\
&= \mathrm{diag}\left( \frac{\theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))\, \omega_1 I}{\theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))}, \ldots, \frac{\theta(\Phi_T(r_q)) \det(D\Phi_T(r_q))\, \omega_q I}{\theta(\Phi_T(r_1)) \det(D\Phi_T(r_1))} \right)^{1/2}.
\end{aligned}$$
Recalling the definition of $\rho_\theta$ as well as (6.5), this formula makes it apparent that $\sigma_{\max}(\tilde{D}_T) \le (\rho_\theta\, \kappa(\mathcal{T})^d M_Q)^{1/2}$ and $\sigma_{\min}(\tilde{D}_T) \ge \rho_\theta^{-1/2}\, \kappa(\mathcal{T})^{-d/2}\, m_Q^{1/2}$. Use the bounds on singular values of $R_T$ and $S_{Q,p}$ derived in the proof of Lemma 6.1 to conclude the lemma.
7. The piecewise linear two-dimensional case. In this section, we specialize the theory developed so far to the case that $p = 1$, $d = 2$, $q = 1$, $r_1 = (1/3, 1/3)$, $\omega_1 = 1/2$, that is, the case of piecewise linear two-dimensional triangular elements integrated with midpoint quadrature, which is a common case in practice. The notation is simplified, and several other simplifications to the analysis are possible.
Midpoint quadrature is accurate for constant and linear functions, and in particular is accurate up to degree $2p - 2 = 0$, so the two assumptions in Theorem 3.1 are valid for this scheme.
In this case, each $\Phi_T$ is an affine linear function, $\tilde{\Omega}$ is a polygon, and $\tilde{\Gamma}_1$ is a piecewise-linear path. We have $l = 3$, the three nodes are $z_1 = (0, 0)$, $z_2 = (1, 0)$, $z_3 = (0, 1)$, and the three affine linear shape functions are $N_1(\xi, \eta) = 1 - \xi - \eta$, $N_2(\xi, \eta) = \xi$, $N_3(\xi, \eta) = \eta$.
In the factorization of $K = A^T S^T R^T D R S A$, $D$ is a $2m \times 2m$ diagonal matrix whose diagonal entries are of the form $\theta_T \det(D\Phi_T)\, \tau_T^2 / 2$, where $\theta_T$ is the value of $\theta(\Phi_T(1/3, 1/3))$, i.e., the value of $\theta$ at the mapped centroid of the element, $\det(D\Phi_T)$ is the determinant of the $2 \times 2$ matrix associated with mapping element $T$ (which is constant over the element), and $\tau_T = \sigma_{\min}(D\Phi_T)^{-1}$.
The matrix $S_{Q,p}$ in this case turns out to be $I$ (the $2 \times 2$ identity matrix) since the two finite differences $U_T(z_2) - U_T(z_1)$ and $U_T(z_3) - U_T(z_1)$ are exactly the two entries of $\nabla U_T$ (which is constant over $T_0$). Therefore, $\bar{\sigma}_{Q,p} = \underline{\sigma}_{Q,p} = 1$. The matrix $R_T$ is also $2 \times 2$ and is $D\Phi_T^{-T} / \tau_T$. Thus, $J_T = D\Phi_T^{-T} / \tau_T$.
For each element, $\tau_T = \|D\Phi_T^{-1}\|$ and $\eta_T = \|D\Phi_T\|$. Thus, $\tau_T \eta_T = \kappa_2(D\Phi_T)$. It can be shown that this quantity, called the aspect ratio of $T$, is within a constant factor of the reciprocal of the minimum angle of $T$. Thus, $\kappa(\mathcal{T})$ is a modest constant provided no triangle in the mesh has a very sharp angle.
Specializing Lemma 6.1, we find that $\sigma_{\min}(J) \ge 1/\kappa(\mathcal{T})$ and $\sigma_{\max}(J) \le 1$. The constant $\rho_\theta$ for this case is identically 1 regardless of the function $\theta(x)$ since, as noted above, it is the ratio of $\theta$ evaluated at two Gauss points of an element, but each element has only one Gauss point. Similarly, the factor $\kappa(\mathcal{T})^{d/2}$ in Lemma 6.2 goes away because it arises as a ratio of max versus min determinants in an element, but the only point where the determinant is evaluated is at the centroid so the ratio is always 1. Therefore, Lemma 6.2 simplifies to $\sigma_{\max}(\bar{J}) \le 1/\sqrt{2}$ and $\sigma_{\min}(\bar{J}) \ge 1/(\sqrt{2}\, \kappa(\mathcal{T}))$.
8. Preconditioning Strategy Summary. In this section we summarize how our theory can be used to construct provably good preconditioners for finite element systems. As mentioned earlier, we know from the literature how to precondition Laplacian-type systems of the form $A^T D A$ where $A$ is a node-arc incidence matrix. We have a factorization $K = A^T J^T D J A$, but the standard theory does not apply because of the presence of $J$. So we refactor as in Lemma 6.2 to obtain $K = A^T \bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2} A$. Now we can apply Corollaries 2.2 and 2.3 by taking
$$V = A^T \bar{D}^{1/2}, \qquad G = \bar{J}^T,$$
and get
$$\sigma(K, \tilde{K})\, \sigma(\tilde{K}, K) \le \kappa(\bar{J})^2,$$
where $\tilde{K} = A^T \bar{D} A$. This is the first step in our preconditioning scheme: approximate $K$ by $\tilde{K}$.
As an aside, consider the general case of preconditioning $A^T J^T D J A$ with $D$ diagonal. Ideally, we want to choose $\bar{D}$ such that $\kappa(\bar{J})$ is minimized subject to $J^T D J = \bar{D}^{1/2} \bar{J}^T \bar{J} \bar{D}^{1/2}$. Finding the optimal scaling $\bar{D}$ is hard, but a good (near-optimal) choice can be found efficiently by choosing $\bar{D}$ to be the symmetric scaling of $J^T D J$ that makes all the diagonal entries one. This implies
$$\bar{D}_{ii} = \left( J^T D J \right)_{ii}, \qquad i = 1, 2, \ldots$$
On the other hand, there is no guarantee for the general case that the resulting $\bar{J}$ is well conditioned. In the case of the finite element method for (4.1), there is special structure in $J$ and $D$ that allows us to prove a result like Lemma 6.2.
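A small numerical sketch (our own illustration, with random stand-ins for $J$ and $D$) shows that this Jacobi-like choice of $\bar{D}$ makes the refactorization identity hold by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((12, 8))          # stand-in for the dqm x (l-1)m factor
D = np.diag(rng.uniform(0.5, 2.0, 12))    # positive definite diagonal

Dbar = np.diag(np.diag(J.T @ D @ J))      # Dbar_ii = (J^T D J)_ii
Jbar = np.sqrt(D) @ J @ np.diag(1.0 / np.sqrt(np.diag(Dbar)))

# J^T D J = Dbar^{1/2} Jbar^T Jbar Dbar^{1/2} holds exactly (up to roundoff).
assert np.allclose(J.T @ D @ J, np.sqrt(Dbar) @ Jbar.T @ Jbar @ np.sqrt(Dbar))
print("cond(Jbar) =", np.linalg.cond(Jbar))
```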
The preconditioning algorithm goes as follows. First compute $\bar{D}$ as given above. Then construct a good preconditioner for $\tilde{K} = A^T \bar{D} A$. This is easy because many preconditioners work well on symmetric diagonally dominant systems. For example, a near-optimal preconditioner was recently given by Spielman and Teng [12]. We remark that the algorithm of Spielman and Teng is quite complex (and has not been implemented yet), so in practice a better choice might be the augmented maximum-weight spanning trees first suggested by Vaidya and described in [2, 4]. In fact, any solver can be used for the symmetric M-matrix $A^T \bar{D} A$, including multigrid and AMG methods. This allows us to reuse existing algorithms and software.
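A sketch of the resulting two-step solver: conjugate gradient on $K$ preconditioned by $\tilde{K} = A^T \bar{D} A$. Here $\tilde{K}$ is applied through SciPy's sparse LU purely as a stand-in for a fast support-graph, multigrid, or AMG solver; the function and its arguments are our own illustration:

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_with_support_preconditioner(K, A, Dbar_diag, f):
    """CG on K x = f, preconditioned by Ktilde = A^T diag(Dbar_diag) A.
    splu is used only for illustration; a support-graph or multigrid
    solver for the diagonally dominant Ktilde would be used in practice."""
    Ktilde = (A.T @ sp.diags(Dbar_diag) @ A).tocsc()
    lu = spla.splu(Ktilde)
    M = spla.LinearOperator(K.shape, matvec=lu.solve)
    x, info = spla.cg(K, f, M=M)
    return x, info
```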
The triangle inequality (2.1) allows us to combine the two steps. In particular, if Spielman and Teng's preconditioner $M$ approximates $\tilde{K}$ well in the sense that $\sigma(\tilde{K}, M)$ is small, and our result above shows that $\sigma(K, \tilde{K})$ is also small, then by (2.1) $\sigma(K, M)$ is small.
Our approximation scheme could be rewritten on an element-by-element basis. Consider an element matrix $K_i = A_i^T J_i^T D_i J_i A_i$, where each row of $A_i$ is also a row of $A$ and thus has the special sparsity structure discussed earlier. We find a diagonal $\bar{D}_i$ that is a good approximation to $J_i^T D_i J_i$. Then, we let $\tilde{K}_i = A_i^T \bar{D}_i A_i$. The overall approximation is $\tilde{K} = \sum_i \tilde{K}_i$. Alternatively, we can take a global view as in the above analysis and write $\tilde{K} = A^T \bar{D} A$, where $A = (A_1; A_2; \ldots; A_m)$ and $\bar{D} = \mathrm{diag}(\bar{D}_1, \bar{D}_2, \ldots, \bar{D}_m)$. To simplify notation, we have adopted the global view in this paper, but the reader should keep in mind that our approximation can take place element by element, which may be important in an implementation.
8.1. Total work. Suppose that we have an iterative algorithm that can solve symmetric, diagonally dominant M-matrix systems in $f(n, m)$ operations, where $n$ is the number of variables and $m$ the number of nonzeros. Spielman and Teng [12] have recently proven that such systems can be solved in near-linear time, that is, $f(n, m) = \tilde{O}(m)$, where $\tilde{O}$ means that some logarithmic factors have been omitted. We have shown that all finite element systems that discretize (4.1) in the usual way can be solved using such an algorithm, and the condition number only increases by a factor $\kappa(\bar{J})^2$. Thus the number of iterations (and the work) increases by at most $\kappa(\bar{J})$. Consequently, we can solve finite element systems with work $\kappa(\bar{J}) \cdot f(n, m)$. When $\kappa(\bar{J})$ is bounded by a constant, the total work is asymptotically the same as for symmetric, diagonally dominant M-matrices, that is, almost linear. This is true even for arbitrarily ill-conditioned systems.
For a given problem instance it is easy to estimate $\kappa(\bar{J})$, since $\bar{J}$ can readily be computed from the given $J$. Since $\bar{J}$ is block diagonal, the extreme singular values of each block can be computed cheaply. One interpretation of our analysis is that the condition number (of the preconditioned system) is proportional to a factor that depends on the quality of the mesh.
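For example, with the per-element diagonal blocks $\bar{J}_T$ collected in a list (the variable blocks below is hypothetical), the estimate is a short loop over small matrices:

```python
import numpy as np

def kappa_Jbar(blocks):
    """kappa(Jbar) = sigma_max(Jbar) / sigma_min(Jbar).  Since Jbar is block
    diagonal, its extreme singular values are the extremes over the small
    dq x (l-1) blocks Jbar_T, one per element."""
    smax = max(np.linalg.svd(B, compute_uv=False)[0] for B in blocks)
    smin = min(np.linalg.svd(B, compute_uv=False)[-1] for B in blocks)
    return smax / smin
```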
9. Open questions. This work is the first to extend support-tree methods, which previously have been shown to be good preconditioners for diagonally dominant matrices with negative off-diagonal entries, to the class of finite element matrices. We have shown that the scope of the method includes the standard scalar elliptic boundary value problem, but perhaps the scope of finite element problems that can be tackled with this method could be expanded further.
One generalization would be the class of problems $-\nabla \cdot (\Theta(x) \nabla u) = f$, where $\Theta(x)$ is a spatially varying $d \times d$ symmetric positive definite matrix. This generalization would present problems for our current analysis in the case that $\Theta(x)$ is highly ill-conditioned. It would still be straightforward to write $K = A^T J^T D J A$ where $D$ is now block diagonal, but our analysis of the introduction of $\bar{J}$ would run into trouble because the $dq \times dq$ diagonal blocks of $D$ are no longer individually well conditioned.
It would also be interesting to tackle vector problems such as linear elasticity or Stokes flow, or higher-order equations like the biharmonic equation. It seems likely that our techniques can extend to at least some of these problems since they all have a symmetric positive definite weak form. A further generalization would be to unsymmetric problems like the convection-diffusion equation. The latter class of problems would require substantial rethinking of the whole approach since condition number reduction, which is very relevant for the application of conjugate gradient to symmetric positive definite systems, is less relevant to the application of GMRES to unsymmetric systems.
Our analysis is based on condition numbers (support numbers). One drawback of this approach is that the convergence and work estimates may be too pessimistic. For instance, the condition number of the preconditioned linear systems depends on $\kappa(\mathcal{T})$, the worst aspect ratio of any element in the mesh. If there is only one poorly shaped element in the mesh, we expect iterative solvers will take only a few extra iterations since changing a single element implies a low-rank correction to the assembled stiffness matrix. Any analysis based on condition numbers will be unable to capture this effect. A related open issue is whether we can exploit recent work in mesh quality metrics [10] to show that good meshes both have small error in the FEM approximation and also produce linear systems that can be well approximated by diagonally dominant systems.
Another point to make about our method is that, although the condition number of the preconditioned system has an upper bound independent of
$$\rho_R = \max_x \theta(x) / \min_x \theta(x),$$
there will still be a loss of significant digits due to roundoff error when using our method in the case that $\rho_R$ is large.