Fundamentals of Computational Fluid Dynamics
David W. Zingg
University of Toronto Institute for Aerospace Studies
1 INTRODUCTION
1.1 Motivation
1.2 Background
1.2.1 Problem Specification and Geometry Preparation
1.2.2 Selection of Governing Equations and Boundary Conditions
1.2.3 Selection of Gridding Strategy and Numerical Method
1.2.4 Assessment and Interpretation of Results
1.3 Overview
1.4 Notation

3 FINITE-DIFFERENCE APPROXIMATIONS
3.1 Meshes and Finite-Difference Notation
3.2 Space Derivative Approximations
3.3 Finite-Difference Operators
3.3.1 Point Difference Operators
3.3.2 Matrix Difference Operators
3.3.3 Periodic Matrices
3.3.4 Circulant Matrices
3.4 Constructing Differencing Schemes of Any Order
3.4.1 Taylor Tables
3.4.2 Generalization of Difference Formulas
3.4.3 Lagrange and Hermite Interpolation Polynomials
3.4.4 Practical Application of Padé Formulas
3.4.5 Other Higher-Order Schemes
3.5 Fourier Error Analysis
3.5.1 Application to a Spatial Operator
3.6 Difference Operators at Boundaries
3.6.1 The Linear Convection Equation
3.6.2 The Diffusion Equation
3.7 Problems

5 FINITE-VOLUME METHODS
5.1 Basic Concepts
5.2 Model Equations in Integral Form
5.2.1 The Linear Convection Equation
5.2.2 The Diffusion Equation
5.3 One-Dimensional Examples
5.3.1 A Second-Order Approximation to the Convection Equation
5.3.2 A Fourth-Order Approximation to the Convection Equation
5.3.3 A Second-Order Approximation to the Diffusion Equation
5.4 A Two-Dimensional Example
5.5 Problems

10 MULTIGRID
10.1 Motivation
10.1.1 Eigenvector and Eigenvalue Identification with Space Frequencies
10.1.2 Properties of the Iterative Method
10.2 The Basic Process
10.3 A Two-Grid Process
10.4 Problems
Chapter 1
INTRODUCTION
1.1 Motivation
The material in this book originated from attempts to understand and systemize nu-
merical solution techniques for the partial differential equations governing the physics
of fluid flow. As time went on and these attempts began to crystallize, underlying
constraints on the nature of the material began to form. The principal such constraint
was the demand for unification. Was there one mathematical structure which could
be used to describe the behavior and results of most numerical methods in common
use in the field of fluid dynamics? Perhaps the answer is arguable, but the authors
believe the answer is affirmative and present this book as justification for that be-
lief. The mathematical structure is the theory of linear algebra and the attendant
eigenanalysis of linear systems.
The ultimate goal of the field of computational fluid dynamics (CFD) is to under-
stand the physical events that occur in the flow of fluids around and within designated
objects. These events are related to the action and interaction of phenomena such
as dissipation, diffusion, convection, shock waves, slip surfaces, boundary layers, and
turbulence. In the field of aerodynamics, all of these phenomena are governed by
the compressible Navier-Stokes equations. Many of the most important aspects of
these relations are nonlinear and, as a consequence, often have no analytic solution.
This, of course, motivates the numerical solution of the associated partial differential
equations. At the same time it would seem to invalidate the use of linear algebra for
the classification of the numerical methods. Experience has shown that such is not
the case.
As we shall see in a later chapter, the use of numerical methods to solve partial
differential equations introduces an approximation that, in effect, can change the
form of the basic partial differential equations themselves. The new equations, which
are the ones actually being solved by the numerical process, are often referred to as
the modified partial differential equations. Since they are not precisely the same as
the original equations, they can, and probably will, simulate the physical phenomena
listed above in ways that are not exactly the same as an exact solution to the basic
partial differential equation. Mathematically, these differences are usually referred to
as truncation errors. However, the theory associated with the numerical analysis of
fluid mechanics was developed predominantly by scientists deeply interested in the
physics of fluid flow and, as a consequence, these errors are often identified with a
particular physical phenomenon on which they have a strong effect. Thus methods are
said to have a lot of “artificial viscosity” or said to be highly dispersive. This means
that the errors caused by the numerical approximation result in a modified partial
differential equation having additional terms that can be identified with the physics
of dissipation in the first case and dispersion in the second. There is nothing wrong,
of course, with identifying an error with a physical process, nor with deliberately
directing an error to a specific physical process, as long as the error remains in some
engineering sense “small”. It is safe to say, for example, that most numerical methods
in practical use for solving the nondissipative Euler equations create a modified partial
differential equation that produces some form of dissipation. However, if used and
interpreted properly, these methods give very useful information.
Regardless of what the numerical errors are called, if their effects are not thor-
oughly understood and controlled, they can lead to serious difficulties, producing
answers that represent little, if any, physical reality. This motivates studying the
concepts of stability, convergence, and consistency. On the other hand, even if the
errors are kept small enough that they can be neglected (for engineering purposes),
the resulting simulation can still be of little practical use if inefficient or inappropriate
algorithms are used. This motivates studying the concepts of stiffness, factorization,
and algorithm development in general. All of these concepts we hope to clarify in
this book.
1.2 Background
The field of computational fluid dynamics has a broad range of applicability. Indepen-
dent of the specific application under study, the following sequence of steps generally
must be followed in order to obtain a satisfactory solution.
1.3 Overview
It should be clear that successful simulation of fluid flows can involve a wide range of
issues from grid generation to turbulence modelling to the applicability of various sim-
plified forms of the Navier-Stokes equations. Many of these issues are not addressed
in this book. Some of them are presented in the books by Anderson, Tannehill, and
Pletcher [1] and Hirsch [2]. Instead we focus on numerical methods, with emphasis
on finite-difference and finite-volume methods for the Euler and Navier-Stokes equa-
tions. Rather than presenting the details of the most advanced methods, which are
still evolving, we present a foundation for developing, analyzing, and understanding
such methods.
Fortunately, to develop, analyze, and understand most numerical methods used to
find solutions for the complete compressible Navier-Stokes equations, we can make use
of much simpler expressions, the so-called “model” equations. These model equations
isolate certain aspects of the physics contained in the complete set of equations. Hence
their numerical solution can illustrate the properties of a given numerical method
when applied to a more complicated system of equations which governs similar phys-
ical phenomena. Although the model equations are extremely simple and easy to
solve, they have been carefully selected to be representative, when used intelligently,
of difficulties and complexities that arise in realistic two- and three-dimensional fluid
flow simulations. We believe that a thorough understanding of what happens when
numerical approximations are applied to the model equations is a major first step in
making confident and competent use of numerical approximations to the Euler and
Navier-Stokes equations. As a word of caution, however, it should be noted that,
although we can learn a great deal by studying numerical methods as applied to the
model equations and can use that information in the design and application of nu-
merical methods to practical problems, there are many aspects of practical problems
which can only be understood in the context of the complete physical systems.
1.4 Notation
The notation is generally explained as it is introduced. Bold type is reserved for real physical vectors, such as velocity. The vector symbol (an arrow over the symbol, as in $\vec{u}$) is used for the vectors (or column matrices) which contain the values of the dependent variable at the nodes
of a grid. Otherwise, the use of a vector consisting of a collection of scalars should
be apparent from the context and is not identified by any special notation. For
example, the variable u can denote a scalar Cartesian velocity component in the Euler
and Navier-Stokes equations, a scalar quantity in the linear convection and diffusion
equations, and a vector consisting of a collection of scalars in our presentation of
hyperbolic systems. Some of the abbreviations used throughout the text are listed
and defined below.
Chapter 2
CONSERVATION LAWS AND THE MODEL EQUATIONS

We start out by casting our equations in the most general form, the integral conservation-law form, which is useful in understanding the concepts involved in finite-volume schemes. The equations are then recast into divergence form, which is natural for
finite-difference schemes. The Euler and Navier-Stokes equations are briefly discussed
in this Chapter. The main focus, though, will be on representative model equations,
in particular, the convection and diffusion equations. These equations contain many
of the salient mathematical and physical features of the full Navier-Stokes equations.
The concepts of convection and diffusion are prevalent in our development of nu-
merical methods for computational fluid dynamics, and the recurring use of these
model equations allows us to develop a consistent framework of analysis for consis-
tency, accuracy, stability, and convergence. The model equations we study have two
properties in common. They are linear partial differential equations (PDE’s) with
coefficients that are constant in both space and time, and they represent phenomena
of importance to the analysis of certain aspects of fluid dynamic problems.
In this equation, Q is a vector containing the set of variables which are conserved,
e.g., mass, momentum, and energy, per unit volume. The equation is a statement of
the conservation of these quantities in a finite region of space with volume V (t) and
surface area S(t) over a finite interval of time t2 − t1 . In two dimensions, the region
of space, or cell, is an area A(t) bounded by a closed contour C(t). The vector n is
a unit vector normal to the surface pointing outward, F is a set of vectors, or tensor,
containing the flux of Q per unit area per unit time, and P is the rate of production
of Q per unit volume per unit time. If all variables are continuous in time, then Eq.
2.1 can be rewritten as
$$\frac{d}{dt}\int_{V(t)} Q\, dV + \oint_{S(t)} \mathbf{n}\cdot\mathbf{F}\, dS = \int_{V(t)} P\, dV \qquad (2.2)$$
Those methods which make various numerical approximations of the integrals in Eqs.
2.1 and 2.2 and find a solution for Q on that basis are referred to as finite-volume
methods. Many of the advanced codes written for CFD applications are based on the
finite-volume concept.
On the other hand, a partial derivative form of a conservation law can also be
derived. The divergence form of Eq. 2.2 is obtained by applying Gauss’s theorem to
the flux integral, leading to
$$\frac{\partial Q}{\partial t} + \nabla\cdot\mathbf{F} = P \qquad (2.3)$$
where ∇. is the well-known divergence operator given, in Cartesian coordinates, by
$$\nabla\cdot \equiv \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right)\cdot \qquad (2.4)$$
and i, j, and k are unit vectors in the x, y, and z coordinate directions, respectively.
Those methods which make various approximations of the derivatives in Eq. 2.3 and
find a solution for Q on that basis are referred to as finite-difference methods.
where ρ is the fluid density, u is the velocity, e is the total energy per unit volume, p is
the pressure, T is the temperature, µ is the coefficient of viscosity, and κ is the thermal
conductivity. The total energy e includes internal energy per unit volume ρε (where ε is the internal energy per unit mass) and kinetic energy per unit volume ρu²/2.
These equations must be supplemented by relations between µ and κ and the fluid
state as well as an equation of state, such as the ideal gas law. Details can be found
in Anderson, Tannehill, and Pletcher [1] and Hirsch [2]. Note that the convective
fluxes lead to first derivatives in space, while the viscous and heat conduction terms
involve second derivatives. This form of the equations is called conservation-law or
conservative form. Non-conservative forms can be obtained by expanding derivatives
of products using the product rule or by introducing different dependent variables,
such as u and p. Although non-conservative forms of the equations are analytically
the same as the above form, they can lead to quite different numerical solutions in
terms of shock strength and shock speed, for example. Thus the conservative form is
appropriate for solving flows with features such as shock waves.
Many flows of engineering interest are steady (time-invariant), or at least may be
treated as such. For such flows, we are often interested in the steady-state solution of
the Navier-Stokes equations, with no interest in the transient portion of the solution.
The steady solution to the one-dimensional Navier-Stokes equations must satisfy
$$\frac{\partial E}{\partial x} = 0 \qquad (2.7)$$
If we neglect viscosity and heat conduction, the Euler equations are obtained. In
two-dimensional Cartesian coordinates, these can be written as
$$\frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} = 0 \qquad (2.8)$$
with
$$Q = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix} = \begin{bmatrix} \rho \\ \rho u \\ \rho v \\ e \end{bmatrix}, \qquad E = \begin{bmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ u(e+p) \end{bmatrix}, \qquad F = \begin{bmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ v(e+p) \end{bmatrix} \qquad (2.9)$$
where u and v are the Cartesian velocity components. Later on we will make use of
the following form of the Euler equations as well:
$$\frac{\partial Q}{\partial t} + A\,\frac{\partial Q}{\partial x} + B\,\frac{\partial Q}{\partial y} = 0 \qquad (2.10)$$
In order to derive the flux Jacobian matrices, we must first write the flux vectors E and F in
terms of the conservative variables, q1 , q2 , q3 , and q4 , as follows:
$$E = \begin{bmatrix} E_1 \\ E_2 \\ E_3 \\ E_4 \end{bmatrix} = \begin{bmatrix} q_2 \\[4pt] (\gamma-1)q_4 + \dfrac{3-\gamma}{2}\dfrac{q_2^2}{q_1} - \dfrac{\gamma-1}{2}\dfrac{q_3^2}{q_1} \\[8pt] \dfrac{q_3 q_2}{q_1} \\[8pt] \gamma\,\dfrac{q_4 q_2}{q_1} - \dfrac{\gamma-1}{2}\,\dfrac{q_2^3 + q_3^2 q_2}{q_1^2} \end{bmatrix} \qquad (2.11)$$

$$F = \begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ F_4 \end{bmatrix} = \begin{bmatrix} q_3 \\[4pt] \dfrac{q_3 q_2}{q_1} \\[8pt] (\gamma-1)q_4 + \dfrac{3-\gamma}{2}\dfrac{q_3^2}{q_1} - \dfrac{\gamma-1}{2}\dfrac{q_2^2}{q_1} \\[8pt] \gamma\,\dfrac{q_4 q_3}{q_1} - \dfrac{\gamma-1}{2}\,\dfrac{q_2^2 q_3 + q_3^3}{q_1^2} \end{bmatrix} \qquad (2.12)$$
We have assumed that the pressure satisfies $p = (\gamma - 1)\left[e - \rho(u^2 + v^2)/2\right]$ from the
ideal gas law, where γ is the ratio of specific heats, cp /cv . From this it follows that
the flux Jacobian of E can be written in terms of the conservative variables as
$$A = \frac{\partial E_i}{\partial q_j} = \begin{bmatrix} 0 & 1 & 0 & 0 \\[4pt] a_{21} & (3-\gamma)\dfrac{q_2}{q_1} & (1-\gamma)\dfrac{q_3}{q_1} & \gamma-1 \\[8pt] -\dfrac{q_2}{q_1}\dfrac{q_3}{q_1} & \dfrac{q_3}{q_1} & \dfrac{q_2}{q_1} & 0 \\[8pt] a_{41} & a_{42} & a_{43} & \gamma\,\dfrac{q_2}{q_1} \end{bmatrix} \qquad (2.13)$$

where

$$a_{21} = \frac{\gamma-1}{2}\left(\frac{q_3}{q_1}\right)^2 - \frac{3-\gamma}{2}\left(\frac{q_2}{q_1}\right)^2$$

$$a_{41} = (\gamma-1)\left[\left(\frac{q_2}{q_1}\right)^3 + \left(\frac{q_3}{q_1}\right)^2\frac{q_2}{q_1}\right] - \gamma\,\frac{q_4}{q_1}\frac{q_2}{q_1}$$

$$a_{42} = \gamma\,\frac{q_4}{q_1} - \frac{\gamma-1}{2}\left[3\left(\frac{q_2}{q_1}\right)^2 + \left(\frac{q_3}{q_1}\right)^2\right]$$

$$a_{43} = -(\gamma-1)\frac{q_2}{q_1}\frac{q_3}{q_1} \qquad (2.14)$$
and in terms of the primitive variables as
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ a_{21} & (3-\gamma)u & (1-\gamma)v & \gamma-1 \\ -uv & v & u & 0 \\ a_{41} & a_{42} & a_{43} & \gamma u \end{bmatrix} \qquad (2.15)$$

where

$$a_{21} = \frac{\gamma-1}{2}v^2 - \frac{3-\gamma}{2}u^2$$

$$a_{41} = (\gamma-1)u(u^2+v^2) - \gamma\,\frac{ue}{\rho}$$

$$a_{42} = \gamma\,\frac{e}{\rho} - \frac{\gamma-1}{2}(3u^2+v^2)$$

$$a_{43} = -(\gamma-1)uv$$
(1) In one type, the scalar quantity u is given on one boundary, corresponding
to a wave entering the domain through this “inflow” boundary. No bound-
ary condition is specified at the opposite side, the “outflow” boundary. This
is consistent in terms of the well-posedness of a 1st -order PDE. Hence the
wave leaves the domain through the outflow boundary without distortion or
reflection. This type of phenomenon is referred to, simply, as the convection
problem. It represents most of the “usual” situations encountered in convect-
ing systems. Note that the left-hand boundary is the inflow boundary when
a is positive, while the right-hand boundary is the inflow boundary when a is
negative.
(2) In the other type, the flow being simulated is periodic. At any given time,
what enters on one side of the domain must be the same as that which is
leaving on the other. This is referred to as the biconvection problem. It is
the simplest to study and serves to illustrate many of the basic properties of
numerical methods applied to problems involving convection, without special
consideration of boundaries. Hence, we pay a great deal of attention to it in
the initial chapters.
Now let us consider a situation in which the initial condition is given by u(x, 0) =
u0 (x), and the domain is infinite. It is easy to show by substitution that the exact
solution to the linear convection equation is then
$$u(x,t) = u_0(x - at) \qquad (2.18)$$
The initial waveform propagates unaltered with speed |a| to the right if a is positive
and to the left if a is negative. With periodic boundary conditions, the waveform
travels through one boundary and reappears at the other boundary, eventually re-
turning to its initial position. In this case, the process continues forever without any
change in the shape of the solution. Preserving the shape of the initial condition
u0 (x) can be a difficult challenge for a numerical method.
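To make Eq. 2.18 concrete, here is a minimal Python sketch (not from the text; the function name and the 2π domain length are choices made for this illustration) of the exact solution with periodic boundary conditions:

```python
import numpy as np

# Exact solution of the linear convection equation u_t + a u_x = 0 with
# periodic boundaries on 0 <= x < 2*pi: the initial waveform shifts by
# a*t and wraps around the domain (Eq. 2.18).
def exact_convection(u0, x, a, t, length=2.0 * np.pi):
    return u0((x - a * t) % length)  # u(x, t) = u0(x - a t), wrapped

x = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
u0 = np.sin
a = 1.0
# After one full period t = 2*pi/a the waveform returns to its start.
print(np.allclose(exact_convection(u0, x, a, 2.0 * np.pi / a), u0(x)))  # True
```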
where the wavenumbers are often ordered such that κ1 ≤ κ2 ≤ · · · ≤ κM . Since the
wave equation is linear, the solution is obtained by summing solutions of the form of
Eq. 2.23, giving
$$u(x,t) = \sum_{m=1}^{M} f_m(0)\, e^{i\kappa_m (x - at)} \qquad (2.26)$$
Dispersion and dissipation resulting from a numerical approximation will cause the
shape of the solution to change from that of the original waveform.
$$\frac{\partial u}{\partial t} = \nu\,\frac{\partial^2 u}{\partial x^2} \qquad (2.27)$$
where ν is a positive real constant. For example, with u representing the tempera-
ture, this parabolic PDE governs the diffusion of heat in one dimension. Boundary
conditions can be periodic, Dirichlet (specified u), Neumann (specified ∂u/∂x), or
mixed Dirichlet/Neumann.
In contrast to the linear convection equation, the diffusion equation has a nontrivial
steady-state solution, which is one that satisfies the governing PDE with the partial
derivative in time equal to zero. In the case of Eq. 2.27, the steady-state solution
must satisfy
$$\frac{\partial^2 u}{\partial x^2} = 0 \qquad (2.28)$$
Therefore, u must vary linearly with x at steady state such that the boundary con-
ditions are satisfied. Other steady-state solutions are obtained if a source term g(x)
is added to Eq. 2.27, as follows:
$$\frac{\partial u}{\partial t} = \nu\left[\frac{\partial^2 u}{\partial x^2} - g(x)\right] \qquad (2.29)$$
In this case, the steady-state solution must satisfy
$$\frac{\partial^2 u}{\partial x^2} - g(x) = 0 \qquad (2.30)$$
The steady-state solution (t → ∞) is simply h(x). Eq. 2.37 shows that high wavenum-
ber components (large κm ) of the solution decay more rapidly than low wavenumber
components, consistent with the physics of diffusion.
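The following short Python sketch (an illustration only; the values of ν and t are arbitrary) evaluates the decay factors $e^{-\nu\kappa^2 t}$ implied by Eq. 2.37 and shows the much faster decay of the high wavenumbers:

```python
import numpy as np

# Decay factor exp(-nu * kappa^2 * t) of each Fourier mode of the
# diffusion equation: high-wavenumber content disappears first.
nu, t = 1.0, 0.1
for kappa in [1, 2, 4, 8]:
    print(f"kappa = {kappa}: decay factor = {np.exp(-nu * kappa**2 * t):.3e}")
```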
2.6 Problems
1. Show that the 1-D Euler equations can be written in terms of the primitive
variables R = [ρ, u, p]T as follows:
$$\frac{\partial R}{\partial t} + M\,\frac{\partial R}{\partial x} = 0$$
where
$$M = \begin{bmatrix} u & \rho & 0 \\ 0 & u & \rho^{-1} \\ 0 & \gamma p & u \end{bmatrix}$$
3. Derive the flux Jacobian matrix A = ∂E/∂Q for the 1-D Euler equations result-
ing from the conservative variable formulation (Eq. 2.5). Find its eigenvalues
and compare with those obtained in question 2.
4. Show that the two matrices M and A derived in questions 1 and 3, respectively,
are related by a similarity transform. (Hint: make use of the matrix S =
∂Q/∂R.)
5. Write the 2-D diffusion equation, Eq. 2.31, in the form of Eq. 2.2.
6. Given the initial condition u(x, 0) = sin x defined on 0 ≤ x ≤ 2π, write it in the
form of Eq. 2.25, that is, find the necessary values of fm (0). (Hint: use M = 2
with κ1 = 1 and κ2 = −1.) Next consider the same initial condition defined
only at x = 2πj/4, j = 0, 1, 2, 3. Find the values of fm (0) required to reproduce
the initial condition at these discrete points using M = 4 with κm = m − 1.
7. Plot the first three basis functions used in constructing the exact solution to
the diffusion equation in Section 2.4.2. Next consider a solution with boundary
conditions ua = ub = 0, and initial conditions from Eq. 2.33 with fm (0) = 1
for 1 ≤ m ≤ 3, fm (0) = 0 for m > 3. Plot the initial condition on the domain
0 ≤ x ≤ π. Plot the solution at t = 1 with ν = 1.
9. The Cauchy-Riemann equations are formed from the coupling of the steady
compressible continuity (conservation of mass) equation
$$\frac{\partial \rho u}{\partial x} + \frac{\partial \rho v}{\partial y} = 0$$
and the vorticity definition
$$\omega = -\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} = 0$$
where ω = 0 for irrotational flow. For isentropic and homenthalpic flow, the
system is closed by the relation
$$\rho = \left[1 - \frac{\gamma-1}{2}\left(u^2 + v^2 - 1\right)\right]^{\frac{1}{\gamma-1}}$$
Note that the variables have been nondimensionalized. Combining the two
PDE’s, we have
$$\frac{\partial f(q)}{\partial x} + \frac{\partial g(q)}{\partial y} = 0$$
where
$$q = \begin{bmatrix} u \\ v \end{bmatrix}, \qquad f = \begin{bmatrix} -\rho u \\ v \end{bmatrix}, \qquad g = \begin{bmatrix} -\rho v \\ -u \end{bmatrix}$$
Chapter 3
FINITE-DIFFERENCE APPROXIMATIONS
In common with the equations governing unsteady fluid flow, our model equations
contain partial derivatives with respect to both space and time. One can approxi-
mate these simultaneously and then solve the resulting difference equations. Alterna-
tively, one can approximate the spatial derivatives first, thereby producing a system
of ordinary differential equations. The time derivatives are approximated next, lead-
ing to a time-marching method which produces a set of difference equations. This
is the approach emphasized here. In this chapter, the concept of finite-difference
approximations to partial derivatives is presented. These can be applied either to
spatial derivatives or time derivatives. Our emphasis in this chapter is on spatial
derivatives; time derivatives are treated in Chapter 6. Strategies for applying these
finite-difference approximations will be discussed in Chapter 4.
All of the material below is presented in a Cartesian system. We emphasize the
fact that quite general classes of meshes expressed in general curvilinear coordinates
in physical space can be transformed to a uniform Cartesian mesh with equispaced
intervals in a so-called computational space, as shown in Figure 3.1. The computational
space is uniform; all the geometric variation is absorbed into variable coefficients of the
transformed equations. For this reason, in much of the following accuracy analysis,
we use an equispaced Cartesian system without being unduly restrictive or losing
practical application.
[Figure 3.2: Space-time grid arrangement, showing grid (node) points with spacing ∆x in x and time levels n−1, n, n+1 separated by ∆t.]
Derivatives are expressed according to the usual conventions. Thus for partial
derivatives in space or time we use interchangeably
$$\partial_x u = \frac{\partial u}{\partial x}\,, \quad \partial_t u = \frac{\partial u}{\partial t}\,, \quad \partial_{xx} u = \frac{\partial^2 u}{\partial x^2}\,, \quad \text{etc.} \qquad (3.4)$$
For the ordinary time derivative in the study of ODE's we use
$$u' = \frac{du}{dt} \qquad (3.5)$$
In this text, subscripts on dependent variables are never used to express derivatives.
Thus $u_x$ will not be used to represent the first derivative of u with respect to x.
The notation for difference approximations follows the same philosophy, but (with
one exception) it is not unique. By this we mean that the symbol δ is used to represent
a difference approximation to a derivative such that, for example,
but the precise nature (and order) of the approximation is not carried in the symbol
δ. Other ways are used to determine its precise meaning. The one exception is the
symbol ∆, which is defined such that
Local difference approximations to a given partial derivative can be formed from linear combinations of $u_j$ and $u_{j+k}$ for $k = \pm 1, \pm 2, \cdots$.
For example, consider the Taylor series expansion for $u_{j+1}$:
$$u_{j+1} = u_j + \Delta x \left(\frac{\partial u}{\partial x}\right)_j + \frac{1}{2}\Delta x^2 \left(\frac{\partial^2 u}{\partial x^2}\right)_j + \cdots + \frac{1}{n!}\Delta x^n \left(\frac{\partial^n u}{\partial x^n}\right)_j + \cdots \qquad (3.9)$$
When expressed in this manner, it is clear that the discrete terms on the left side of
the equation represent a first derivative with a certain amount of error which appears
on the right side of the equal sign. It is also clear that the error depends on the
grid spacing to a certain order. The error term containing the grid spacing to the
lowest power gives the order of the method. From Eq. 3.10, we see that the expression $(u_{j+1} - u_j)/\Delta x$ is a first-order approximation to $(\partial u/\partial x)_j$. Similarly, Eq. 3.11 shows that $(u_{j+1} - u_{j-1})/(2\Delta x)$ is a second-order approximation to a first derivative. The latter is referred to as the three-point centered difference approximation, and one often sees the summary result presented in the form
$$\left(\frac{\partial u}{\partial x}\right)_j = \frac{u_{j+1} - u_{j-1}}{2\Delta x} + O(\Delta x^2) \qquad (3.12)$$
¹We assume that u(x, t) is continuously differentiable.
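These orders can be observed numerically by halving ∆x for a smooth function; a minimal Python sketch (illustrative only; u = sin x and the sample point are arbitrary choices):

```python
import numpy as np

# Halving dx should reduce the error of (u_{j+1} - u_j)/dx by ~2
# (first order, Eq. 3.10) and of (u_{j+1} - u_{j-1})/(2 dx) by ~4
# (second order, Eq. 3.12).
u, dudx, x0 = np.sin, np.cos, 1.0
for dx in [0.1, 0.05, 0.025]:
    e1 = abs((u(x0 + dx) - u(x0)) / dx - dudx(x0))
    e2 = abs((u(x0 + dx) - u(x0 - dx)) / (2 * dx) - dudx(x0))
    print(f"dx = {dx:5.3f}: forward error = {e1:.2e}, centered error = {e2:.2e}")
```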
These are the basis for point difference operators since they give an approximation to
a derivative at one discrete point in a mesh in terms of surrounding points. However,
neither of these expressions tells us how other points in the mesh are differenced or
how boundary conditions are enforced. Such additional information requires a more
sophisticated formulation.
[Four-point mesh on 0 ≤ x ≤ π: boundary points a (x = 0) and b (x = π), with interior nodes j = 1, · · · , M.]
Now impose Dirichlet boundary conditions, $u(0) = u_a$, $u(\pi) = u_b$, and use the centered difference approximation given by Eq. 3.15² at every point in the mesh. We arrive at the four equations:
²We will derive the second derivative operator shortly.
$$(\delta_{xx} u)_1 = \frac{1}{\Delta x^2}\left(u_a - 2u_1 + u_2\right)$$
$$(\delta_{xx} u)_2 = \frac{1}{\Delta x^2}\left(u_1 - 2u_2 + u_3\right)$$
$$(\delta_{xx} u)_3 = \frac{1}{\Delta x^2}\left(u_2 - 2u_3 + u_4\right)$$
$$(\delta_{xx} u)_4 = \frac{1}{\Delta x^2}\left(u_3 - 2u_4 + u_b\right) \qquad (3.16)$$
Writing these equations in a more suggestive form, it is clear that we can express them in a vector-matrix form, and further, that the resulting matrix has a very special form. Introducing
$$\vec{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix}, \qquad \vec{bc} = \frac{1}{\Delta x^2}\begin{bmatrix} u_a \\ 0 \\ 0 \\ u_b \end{bmatrix} \qquad (3.18)$$
and
$$A = \frac{1}{\Delta x^2} \begin{bmatrix} -2 & 1 & & \\ 1 & -2 & 1 & \\ & 1 & -2 & 1 \\ & & 1 & -2 \end{bmatrix} \qquad (3.19)$$
we can write Eq. 3.16 as
$$\delta_{xx}\vec{u} = A\vec{u} + \vec{bc} \qquad (3.20)$$
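A minimal NumPy sketch of how the operator of Eq. 3.19 and the boundary vector of Eq. 3.18 might be assembled and checked (the test function u(x) = x and its boundary values are assumptions made for this illustration):

```python
import numpy as np

# Assemble the 4-point matrix operator of Eq. 3.19 and the boundary
# vector of Eq. 3.18, so that delta_xx u ~ A u + bc (Eq. 3.20).
M = 4
dx = np.pi / (M + 1)                      # mesh of Section 3.3.2 on [0, pi]
ua, ub = 0.0, np.pi                       # Dirichlet data for u(x) = x
A = (np.diag(-2.0 * np.ones(M)) + np.diag(np.ones(M - 1), 1)
     + np.diag(np.ones(M - 1), -1)) / dx**2
bc = np.array([ua, 0.0, 0.0, ub]) / dx**2

x = dx * np.arange(1, M + 1)              # interior nodes j = 1, ..., 4
print(A @ x + bc)                         # ~[0 0 0 0]: exact for linear u
```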
This example illustrates a matrix difference operator. Each line of a matrix differ-
ence operator is based on a point difference operator, but the point operators used
from line to line are not necessarily the same. For example, boundary conditions may
dictate that the lines at or near the bottom or top of the matrix be modified. In the
extreme case of the matrix difference operator representing a spectral method, none
of the lines is the same. The matrix operators representing the three-point central-
difference approximations for a first and second derivative with Dirichlet boundary
conditions on a four-point mesh are
$$\delta_x = \frac{1}{2\Delta x}\begin{bmatrix} 0 & 1 & & \\ -1 & 0 & 1 & \\ & -1 & 0 & 1 \\ & & -1 & 0 \end{bmatrix}, \qquad \delta_{xx} = \frac{1}{\Delta x^2}\begin{bmatrix} -2 & 1 & & \\ 1 & -2 & 1 & \\ & 1 & -2 & 1 \\ & & 1 & -2 \end{bmatrix} \qquad (3.21)$$
As a further example, replace the fourth line in Eq. 3.16 by the following point
operator for a Neumann boundary condition (See Section 3.6.):
$$(\delta_{xx} u)_4 = \frac{2}{3\Delta x}\left(\frac{\partial u}{\partial x}\right)_b - \frac{2}{3\Delta x^2}\left(u_4 - u_3\right) \qquad (3.22)$$
Each of these matrix difference operators is a square matrix with elements that are
all zeros except for those along bands which are clustered around the central diagonal.
We call such a matrix a banded matrix and introduce the notation
$$B(M : a, b, c) = \begin{bmatrix} b & c & & & \\ a & b & c & & \\ & \ddots & \ddots & \ddots & \\ & & a & b & c \\ & & & a & b \end{bmatrix} \qquad (3.25)$$
where the matrix dimensions are M × M . Use of M in the argument is optional,
and the illustration is given for a simple tridiagonal matrix although any number of
bands is a possibility. A tridiagonal matrix without constants along the bands can be
expressed as B(a, b, c). The arguments for a banded matrix are always odd in number
and the central one always refers to the central diagonal.
We can now generalize our previous examples. Defining $\vec{u}$ as³
$$\vec{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_M \end{bmatrix} \qquad (3.26)$$
Notice that the matrix operators given by Eqs. 3.27 and 3.29 carry more informa-
tion than the point operator given by Eq. 3.15. In Eqs. 3.27 and 3.29, the boundary
conditions have been uniquely specified and it is clear that the same point operator
has been applied at every point in the field except at the boundaries. The ability to
specify in the matrix derivative operator the exact nature of the approximation at the
³Note that $\vec{u}$ is a function of time only since each element corresponds to one specific spatial location.
various points in the field including the boundaries permits the use of quite general
constructions which will be useful later in considerations of stability.
Since we make considerable use of both matrix and point operators, it is important
to establish a relation between them. A point operator is generally written for some
derivative at the reference point j in terms of neighboring values of the function. For
example
$$(\delta_x u)_j = a_2 u_{j-2} + a_1 u_{j-1} + b\, u_j + c_1 u_{j+1} \qquad (3.30)$$
might be the point operator for a first derivative. The corresponding matrix operator has for its arguments the coefficients giving the weights to the values of the function at the various locations. A j-shift in the point operator corresponds to a diagonal shift in the matrix operator. Thus the matrix equivalent of Eq. 3.30 is
$$\delta_x \vec{u} = B(a_2, a_1, b, c_1, 0)\,\vec{u} \qquad (3.31)$$
Note the addition of a zero in the fifth element which makes it clear that b is the coefficient of $u_j$.
[Eight-point periodic mesh on 0 ≤ x ≤ 2π: the node numbering 1, 2, · · · , 8 repeats cyclically, so that node j = 0 and node j = M label the same point.]
The matrix that represents differencing schemes for scalar equations on a periodic
mesh is referred to as a periodic matrix. A typical periodic tridiagonal matrix operator
[Figure 3.3: Eight points on a circular mesh.]
is given by
$$B_p(M : a, b, c) = \begin{bmatrix} b & c & & & a \\ a & b & c & & \\ & \ddots & \ddots & \ddots & \\ & & a & b & c \\ c & & & a & b \end{bmatrix} \qquad (3.34)$$
and
$$(\delta_{xx})_p = \frac{1}{\Delta x^2}\begin{bmatrix} -2 & 1 & & 1 \\ 1 & -2 & 1 & \\ & 1 & -2 & 1 \\ 1 & & 1 & -2 \end{bmatrix} = \frac{1}{\Delta x^2}\, B_p(1, -2, 1) \qquad (3.35)$$
Clearly, these special cases of periodic operators are also circulant operators. Later
on we take advantage of this special property. Notice that there are no boundary
condition vectors since this information is all interior to the matrices themselves.
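A short Python sketch (M = 8 is an arbitrary choice here) builds the circulant operator of Eq. 3.35 by cyclically shifting one row and checks it against the known second derivative of sin x:

```python
import numpy as np

# The periodic operator (delta_xx)_p of Eq. 3.35 is circulant: every row
# is the row above shifted one place, and no boundary vector appears.
M = 8
dx = 2.0 * np.pi / M
row = np.zeros(M)
row[0], row[1], row[-1] = -2.0, 1.0, 1.0
Bp = np.stack([np.roll(row, j) for j in range(M)]) / dx**2

x = dx * np.arange(M)
u = np.sin(x)
# d2(sin)/dx2 = -sin, so Bp u + u should be small (an O(dx^2) error).
print(np.max(np.abs(Bp @ u + u)))
```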
$$\left(\frac{\partial^2 u}{\partial x^2}\right)_j - \frac{1}{\Delta x^2}\left(a\, u_{j-1} + b\, u_j + c\, u_{j+1}\right) = \;?$$

| | $u_j$ | $\Delta x \cdot \left(\frac{\partial u}{\partial x}\right)_j$ | $\Delta x^2 \cdot \left(\frac{\partial^2 u}{\partial x^2}\right)_j$ | $\Delta x^3 \cdot \left(\frac{\partial^3 u}{\partial x^3}\right)_j$ | $\Delta x^4 \cdot \left(\frac{\partial^4 u}{\partial x^4}\right)_j$ |
|---|---|---|---|---|---|
| $\Delta x^2 \cdot \left(\frac{\partial^2 u}{\partial x^2}\right)_j$ | | | $1$ | | |
| $-a \cdot u_{j-1}$ | $-a$ | $-a\cdot(-1)\cdot\frac{1}{1}$ | $-a\cdot(-1)^2\cdot\frac{1}{2}$ | $-a\cdot(-1)^3\cdot\frac{1}{6}$ | $-a\cdot(-1)^4\cdot\frac{1}{24}$ |
| $-b \cdot u_j$ | $-b$ | | | | |
| $-c \cdot u_{j+1}$ | $-c$ | $-c\cdot(1)\cdot\frac{1}{1}$ | $-c\cdot(1)^2\cdot\frac{1}{2}$ | $-c\cdot(1)^3\cdot\frac{1}{6}$ | $-c\cdot(1)^4\cdot\frac{1}{24}$ |

Table 3.1. Taylor table for centered 3-point Lagrangian approximation to a second derivative.
The table is constructed so that some of the algebra is simplified. At the top of the
table we see an expression with a question mark. This represents one of the questions
that a study of this table can answer; namely, what is the local error caused by the
use of this approximation? Notice that all of the terms in the equation appear in
a column at the left of the table (although, in this case, ∆x2 has been multiplied
into each term in order to simplify the terms to be put into the table). Then notice
that at the head of each column there appears the common factor that occurs in the
expansion of each term about the point j, that is,
$$\Delta x^k \cdot \left(\frac{\partial^k u}{\partial x^k}\right)_j\,, \qquad k = 0, 1, 2, \cdots$$
The columns to the right of the leftmost one, under the headings, make up the Taylor
table. Each entry is the coefficient of the term at the top of the corresponding column
in the Taylor series expansion of the term to the left of the corresponding row. For
example, the last row in the table corresponds to the Taylor series expansion of
$-c\, u_{j+1}$:
$$-c\, u_{j+1} = -c\, u_j - c\cdot(1)\cdot\Delta x \left(\frac{\partial u}{\partial x}\right)_j - c\cdot(1)^2\cdot\frac{1}{2}\,\Delta x^2 \left(\frac{\partial^2 u}{\partial x^2}\right)_j - c\cdot(1)^3\cdot\frac{1}{6}\,\Delta x^3 \left(\frac{\partial^3 u}{\partial x^3}\right)_j - c\cdot(1)^4\cdot\frac{1}{24}\,\Delta x^4 \left(\frac{\partial^4 u}{\partial x^4}\right)_j - \cdots \qquad (3.36)$$
Consider the sum of each of these columns. To maximize the order of accuracy
of the method, we proceed from left to right and force, by the proper choice of a, b,
and c, these sums to be zero. One can easily show that the sums of the first three
columns are zero if we satisfy the equation
$$\begin{bmatrix} -1 & -1 & -1 \\ 1 & 0 & -1 \\ -1 & 0 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -2 \end{bmatrix}$$
which gives $[a, b, c] = [1, -2, 1]$.
In this case ert occurs at the fifth column in the table (for this example all even
columns will vanish by symmetry) and one finds
$$\mathrm{er_t} = \frac{1}{\Delta x^2}\left[\frac{-a-c}{24}\right]\Delta x^4 \left(\frac{\partial^4 u}{\partial x^4}\right)_j = \frac{-\Delta x^2}{12}\left(\frac{\partial^4 u}{\partial x^4}\right)_j \qquad (3.37)$$
Note that ∆x2 has been divided through to make the error term consistent. We
have just derived the familiar 3-point central-differencing point operator for a second
derivative
$$\left(\frac{\partial^2 u}{\partial x^2}\right)_j - \frac{1}{\Delta x^2}\left(u_{j-1} - 2u_j + u_{j+1}\right) = O(\Delta x^2) \qquad (3.38)$$
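Letting the machine do the algebra, the 3×3 system (as reconstructed above) can be solved numerically; a minimal Python sketch:

```python
import numpy as np

# Solve the Taylor-table system for the centered second-derivative
# coefficients; the result is the stencil of Eq. 3.38.
T = np.array([[-1.0, -1.0, -1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  0.0, -1.0]])
rhs = np.array([0.0, 0.0, -2.0])
print(np.linalg.solve(T, rhs))  # [ 1. -2.  1.]
```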
$$\left(\frac{\partial u}{\partial x}\right)_j - \frac{1}{\Delta x}\left(a_2 u_{j-2} + a_1 u_{j-1} + b\, u_j\right) = \;?$$

| | $u_j$ | $\Delta x \cdot \left(\frac{\partial u}{\partial x}\right)_j$ | $\Delta x^2 \cdot \left(\frac{\partial^2 u}{\partial x^2}\right)_j$ | $\Delta x^3 \cdot \left(\frac{\partial^3 u}{\partial x^3}\right)_j$ | $\Delta x^4 \cdot \left(\frac{\partial^4 u}{\partial x^4}\right)_j$ |
|---|---|---|---|---|---|
| $\Delta x \cdot \left(\frac{\partial u}{\partial x}\right)_j$ | | $1$ | | | |
| $-a_2 \cdot u_{j-2}$ | $-a_2$ | $-a_2\cdot(-2)\cdot\frac{1}{1}$ | $-a_2\cdot(-2)^2\cdot\frac{1}{2}$ | $-a_2\cdot(-2)^3\cdot\frac{1}{6}$ | $-a_2\cdot(-2)^4\cdot\frac{1}{24}$ |
| $-a_1 \cdot u_{j-1}$ | $-a_1$ | $-a_1\cdot(-1)\cdot\frac{1}{1}$ | $-a_1\cdot(-1)^2\cdot\frac{1}{2}$ | $-a_1\cdot(-1)^3\cdot\frac{1}{6}$ | $-a_1\cdot(-1)^4\cdot\frac{1}{24}$ |
| $-b \cdot u_j$ | $-b$ | | | | |

Table 3.2. Taylor table for backward 3-point Lagrangian approximation to a first derivative.
which gives $[a_2, a_1, b] = \frac{1}{2}[1, -4, 3]$. In this case the fourth column provides the leading truncation error term:
$$\mathrm{er_t} = \frac{1}{\Delta x}\left[\frac{8a_2 + a_1}{6}\right]\Delta x^3 \left(\frac{\partial^3 u}{\partial x^3}\right)_j = \frac{\Delta x^2}{3}\left(\frac{\partial^3 u}{\partial x^3}\right)_j \qquad (3.39)$$
where the ai are coefficients to be determined through the use of Taylor tables to
produce approximations of a given order. Clearly this process can be used to find
forward, backward, skewed, or central point operators of any order for any derivative.
It could be computer automated and extended to higher dimensions. More important,
however, is the fact that it can be further generalized. In order to do this, let us
approach the subject in a slightly different way, that is from the point of view of
interpolation formulas. These formulas are discussed in many texts on numerical
analysis.
$$u(x) = \sum_{k=0}^{K} a_k(x)\, u_k \qquad (3.42)$$
where $a_k(x)$ are polynomials in x of degree K. The construction of the $a_k(x)$ can be taken from the simple Lagrangian formula for quadratic interpolation (or extrapolation) with non-equispaced points:
$$u(x) = u_0 \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + u_1 \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + u_2 \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} \qquad (3.43)$$
Notice that the coefficient of each uk is one when x = xk , and zero when x takes
any other discrete value in the set. If we take the first or second derivative of u(x),
impose an equispaced mesh, and evaluate these derivatives at the appropriate dis-
crete point, we rederive the finite-difference approximations just presented. Finite-
difference schemes that can be derived from Eq. 3.42 are referred to as Lagrangian
approximations.
A generalization of the Lagrangian approach is brought about by using Hermitian
interpolation. To construct a polynomial for u(x), Hermite formulas use values of the
function and its derivative(s) at given points in space. Our illustration is for the case
in which discrete values of the function and its first derivative are used, producing
the expression
$$u(x) = \sum_k a_k(x)\, u_k + \sum_k b_k(x) \left(\frac{\partial u}{\partial x}\right)_k \qquad (3.44)$$
analogous to Eq. 3.44. An example formula is illustrated at the top of Table 3.3. Here
not only is the derivative at point j represented, but also included are derivatives at
points j − 1 and j + 1, which also must be expanded using Taylor series about point
j. This requires the following generalization of the Taylor series expansion given in
Eq. 3.8:
$$\left(\frac{\partial^m u}{\partial x^m}\right)_{j+k} = \sum_{n=0}^{\infty} \frac{(k\Delta x)^n}{n!}\, \frac{\partial^n}{\partial x^n}\!\left(\frac{\partial^m u}{\partial x^m}\right)_j \qquad (3.46)$$
The derivative terms now have coefficients (the coefficient on the j point is taken
as one to simplify the algebra) which must be determined using the Taylor table
approach as outlined below.
$$d\left(\frac{\partial u}{\partial x}\right)_{j-1} + \left(\frac{\partial u}{\partial x}\right)_j + e\left(\frac{\partial u}{\partial x}\right)_{j+1} - \frac{1}{\Delta x}\left(a u_{j-1} + b u_j + c u_{j+1}\right) = \;?$$

| | $u_j$ | $\Delta x \cdot \left(\frac{\partial u}{\partial x}\right)_j$ | $\Delta x^2 \cdot \left(\frac{\partial^2 u}{\partial x^2}\right)_j$ | $\Delta x^3 \cdot \left(\frac{\partial^3 u}{\partial x^3}\right)_j$ | $\Delta x^4 \cdot \left(\frac{\partial^4 u}{\partial x^4}\right)_j$ | $\Delta x^5 \cdot \left(\frac{\partial^5 u}{\partial x^5}\right)_j$ |
|---|---|---|---|---|---|---|
| $\Delta x \cdot d\left(\frac{\partial u}{\partial x}\right)_{j-1}$ | | $d$ | $d\cdot(-1)\cdot\frac{1}{1}$ | $d\cdot(-1)^2\cdot\frac{1}{2}$ | $d\cdot(-1)^3\cdot\frac{1}{6}$ | $d\cdot(-1)^4\cdot\frac{1}{24}$ |
| $\Delta x \cdot \left(\frac{\partial u}{\partial x}\right)_j$ | | $1$ | | | | |
| $\Delta x \cdot e\left(\frac{\partial u}{\partial x}\right)_{j+1}$ | | $e$ | $e\cdot(1)\cdot\frac{1}{1}$ | $e\cdot(1)^2\cdot\frac{1}{2}$ | $e\cdot(1)^3\cdot\frac{1}{6}$ | $e\cdot(1)^4\cdot\frac{1}{24}$ |
| $-a \cdot u_{j-1}$ | $-a$ | $-a\cdot(-1)\cdot\frac{1}{1}$ | $-a\cdot(-1)^2\cdot\frac{1}{2}$ | $-a\cdot(-1)^3\cdot\frac{1}{6}$ | $-a\cdot(-1)^4\cdot\frac{1}{24}$ | $-a\cdot(-1)^5\cdot\frac{1}{120}$ |
| $-b \cdot u_j$ | $-b$ | | | | | |
| $-c \cdot u_{j+1}$ | $-c$ | $-c\cdot(1)\cdot\frac{1}{1}$ | $-c\cdot(1)^2\cdot\frac{1}{2}$ | $-c\cdot(1)^3\cdot\frac{1}{6}$ | $-c\cdot(1)^4\cdot\frac{1}{24}$ | $-c\cdot(1)^5\cdot\frac{1}{120}$ |

Table 3.3. Taylor table for central 3-point Hermitian approximation to a first derivative.
having the solution $[a, b, c, d, e] = \frac{1}{4}[-3, 0, 3, 1, 1]$. Under these conditions the sixth column sums to
$$\mathrm{er_t} = \frac{\Delta x^4}{120}\left(\frac{\partial^5 u}{\partial x^5}\right)_j \qquad (3.47)$$
$$\frac{\partial^2 u}{\partial x^2} - \frac{1}{12\Delta x^2}\, B_p(-1, 16, -30, 16, -1)\, \vec{u} = O(\Delta x^4) \qquad (3.55)$$
$$\frac{\partial e^{i\kappa x}}{\partial x} = i\kappa\, e^{i\kappa x} \qquad (3.56)$$
$$\begin{aligned}
(\delta_x u)_j &= \frac{u_{j+1} - u_{j-1}}{2\Delta x} \\
&= \frac{e^{i\kappa \Delta x (j+1)} - e^{i\kappa \Delta x (j-1)}}{2\Delta x} \\
&= \frac{\left(e^{i\kappa \Delta x} - e^{-i\kappa \Delta x}\right) e^{i\kappa x_j}}{2\Delta x} \\
&= \frac{1}{2\Delta x}\left[(\cos \kappa \Delta x + i \sin \kappa \Delta x) - (\cos \kappa \Delta x - i \sin \kappa \Delta x)\right] e^{i\kappa x_j} \\
&= i\,\frac{\sin \kappa \Delta x}{\Delta x}\, e^{i\kappa x_j} \\
&= i \kappa^* e^{i\kappa x_j}
\end{aligned} \qquad (3.57)$$
[Figure 3.4: Modified wavenumber κ*∆x vs. κ∆x for the second-order central, fourth-order central, and fourth-order Padé schemes.]
$$\kappa^* = \frac{\sin \kappa \Delta x}{\Delta x} \qquad (3.58)$$
Note that κ∗ approximates κ to second-order accuracy, as is to be expected, since
$$\kappa^* = \frac{\sin \kappa\Delta x}{\Delta x} = \kappa - \frac{\kappa^3 \Delta x^2}{6} + \cdots$$
For a general centered operator with antisymmetric and symmetric parts,⁵ we can write
$$(\delta_x^a u)_j = \frac{1}{\Delta x}\left[a_1(u_{j+1} - u_{j-1}) + a_2(u_{j+2} - u_{j-2}) + a_3(u_{j+3} - u_{j-3})\right]$$
and
$$(\delta_x^s u)_j = \frac{1}{\Delta x}\left[d_0 u_j + d_1(u_{j+1} + u_{j-1}) + d_2(u_{j+2} + u_{j-2}) + d_3(u_{j+3} + u_{j-3})\right]$$
The corresponding modified wavenumber is
$$i\kappa^* = \frac{1}{\Delta x}\left[d_0 + 2(d_1 \cos \kappa\Delta x + d_2 \cos 2\kappa\Delta x + d_3 \cos 3\kappa\Delta x)\right] + \frac{2i}{\Delta x}\left(a_1 \sin \kappa\Delta x + a_2 \sin 2\kappa\Delta x + a_3 \sin 3\kappa\Delta x\right) \qquad (3.59)$$
As an example of the Fourier analysis of an implicit operator, consider the fourth-order Padé scheme derived in Section 3.4.4:
$$(\delta_x u)_{j-1} + 4(\delta_x u)_j + (\delta_x u)_{j+1} = \frac{3}{\Delta x}\left(u_{j+1} - u_{j-1}\right)$$
The modified wavenumber for this scheme satisfies⁶
$$i\kappa^* e^{-i\kappa\Delta x} + 4i\kappa^* + i\kappa^* e^{i\kappa\Delta x} = \frac{3}{\Delta x}\left(e^{i\kappa\Delta x} - e^{-i\kappa\Delta x}\right)$$
which gives
$$i\kappa^* = \frac{3i \sin \kappa\Delta x}{(2 + \cos \kappa\Delta x)\Delta x}$$
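The modified wavenumbers of these schemes can be tabulated directly. In the Python sketch below, the formula used for the fourth-order central scheme, κ*∆x = [8 sin κ∆x − sin 2κ∆x]/6, is the standard result for the 5-point operator and is stated here as an assumption, since its derivation is not reproduced above:

```python
import numpy as np

# Modified wavenumber kappa*dx for three first-derivative schemes.
kdx = np.linspace(0.01, np.pi, 5)                   # sample values of kappa dx
k2 = np.sin(kdx)                                    # 2nd-order central (Eq. 3.58)
k4 = (8.0 * np.sin(kdx) - np.sin(2.0 * kdx)) / 6.0  # 4th-order central (assumed)
kp = 3.0 * np.sin(kdx) / (2.0 + np.cos(kdx))        # 4th-order Pade
for vals in zip(kdx, k2, k4, kp):
    print("exact %.3f   2nd %.3f   4th %.3f   Pade %.3f" % vals)
```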
The modified wavenumber provides a useful tool for assessing difference approx-
imations. In the context of the linear convection equation, the errors can be given
a physical interpretation. Consider once again the linear convection equation in the
form
$$\frac{\partial u}{\partial t} + a\,\frac{\partial u}{\partial x} = 0$$
⁵In terms of a circulant matrix operator A, the antisymmetric part is obtained from (A − Aᵀ)/2 and the symmetric part from (A + Aᵀ)/2.
⁶Note that terms such as (δx u)ⱼ₋₁ are handled by letting (δx u)ⱼ = iκ*e^{iκj∆x} and evaluating the shift in j.
The exact solution is $u(x,t) = f(0)e^{i\kappa(x-at)}$; we seek a numerical solution of the form
$$u(x,t) = f(t)\, e^{i\kappa x} \qquad (3.60)$$
If second-order centered differences are applied to the spatial term, the following
ODE is obtained for f (t):
$$\frac{df}{dt} = -ia\,\frac{\sin \kappa\Delta x}{\Delta x}\, f = -ia\kappa^* f \qquad (3.61)$$
Solving this ODE exactly (since we are considering the error from the spatial approx-
imation only) and substituting into Eq. 3.60, we obtain
$$u_{\text{numerical}}(x,t) = f(0)\, e^{i\kappa(x - a^* t)} \qquad (3.62)$$
where a∗ is the numerical (or modified) phase speed, which is related to the modified
wavenumber by
$$\frac{a^*}{a} = \frac{\kappa^*}{\kappa}$$
For the above example,
$$\frac{a^*}{a} = \frac{\sin \kappa\Delta x}{\kappa\Delta x}$$
The numerical phase speed is the speed at which a harmonic function is propagated
numerically. Since a∗ /a ≤ 1 for this example, the numerical solution propagates
too slowly. Since a∗ is a function of the wavenumber, the numerical approximation
introduces dispersion, although the original PDE is nondispersive. As a result, a
waveform consisting of many different wavenumber components eventually loses its
original form.
Figure 3.5 shows the numerical phase speed for the schemes considered previously.
The number of points per wavelength (P P W ) by which a given wave is resolved is
given by 2π/κ∆x. The resolving efficiency of a scheme can be expressed in terms
of the P P W required to produce errors below a specified level. For example, the
second-order centered difference scheme requires 80 P P W to produce an error in
phase speed of less than 0.1 percent. The 5-point fourth-order centered scheme and the fourth-order Padé scheme require substantially fewer points per wavelength, as shown in Figure 3.5.

[Figure 3.5: Numerical phase speed a*/a vs. κ∆x for the second-order central, fourth-order central, and fourth-order Padé schemes.]

3.6 Difference Operators at Boundaries
where a, b, and c are constants which can easily be determined using a Taylor table,
as shown in Table 3.4.
$$\left(\frac{\partial^2 u}{\partial x^2}\right)_j - \left[\frac{1}{\Delta x^2}\left(a u_{j-1} + b u_j\right) + \frac{c}{\Delta x}\left(\frac{\partial u}{\partial x}\right)_{j+1}\right] = \;?$$

[Table 3.4: Taylor table for the operator using $u_{j-1}$, $u_j$, and $(\partial u/\partial x)_{j+1}$; the entries are Taylor-series coefficients constructed in the same manner as Tables 3.1-3.3.]
which produces the difference operator given in Eq. 3.29. Notice that this operator
is second-order accurate. In the case of a numerical approximation to a Neumann
boundary condition, this is necessary to obtain a globally second-order accurate for-
mulation. This contrasts with the numerical boundary schemes described previously
which can be one order lower than the interior scheme.
We can also obtain the operator in Eq. 3.77 using the space extrapolation idea.
Consider a second-order backward-difference approximation applied at node M + 1:
$$\left(\frac{\partial u}{\partial x}\right)_{M+1} = \frac{1}{2\Delta x}\left(u_{M-1} - 4u_M + 3u_{M+1}\right) + O(\Delta x^2) \qquad (3.78)$$
Solving for uM +1 gives
$$u_{M+1} = \frac{1}{3}\left[4u_M - u_{M-1} + 2\Delta x \left(\frac{\partial u}{\partial x}\right)_{M+1}\right] + O(\Delta x^3) \qquad (3.79)$$
Substituting this into the second-order centered difference operator for a second
derivative applied at node M gives
$$(\delta_{xx} u)_M = \frac{1}{\Delta x^2}\left(u_{M+1} - 2u_M + u_{M-1}\right) \qquad (3.80)$$
$$= \frac{1}{3\Delta x^2}\left[3u_{M-1} - 6u_M + 4u_M - u_{M-1} + 2\Delta x\left(\frac{\partial u}{\partial x}\right)_{M+1}\right]$$
$$= \frac{1}{3\Delta x^2}\left(2u_{M-1} - 2u_M\right) + \frac{2}{3\Delta x}\left(\frac{\partial u}{\partial x}\right)_{M+1} \qquad (3.81)$$
3.7 Problems
1. Derive a third-order finite-difference approximation to a first derivative in the
form
$$(\delta_x u)_j = \frac{1}{\Delta x}\left(a u_{j-2} + b u_{j-1} + c u_j + d u_{j+1}\right)$$
Find the leading error term.
2. Derive a finite-difference approximation to a first derivative in the form
$$a(\delta_x u)_{j-1} + (\delta_x u)_j = \frac{1}{\Delta x}\left(b u_{j-1} + c u_j + d u_{j+1}\right)$$
Find the leading error term.
3. Using a 4 (interior) point mesh, write out the 4×4 matrices and the boundary-
condition vector formed by using the scheme derived in question 2 when both
u and ∂u/∂x are given at j = 0 and u is given at j = 5.
$$(\delta_{xxx} u)_j = \frac{1}{\Delta x^3}\left(a u_{j-2} + b u_{j-1} + c u_j + d u_{j+1} + e u_{j+2}\right)$$
Find the leading error term.
7. Find the modified wavenumber for the operator derived in question 1. Plot the
real and imaginary parts of κ∗ ∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Compare the real
part with that obtained from the fourth-order centered operator (Eq. 3.54).
$$\frac{\partial^2 e^{i\kappa x}}{\partial x^2} = -\kappa^2 e^{i\kappa x}$$
Application of a difference operator for the second derivative gives
$$(\delta_{xx}\, e^{i\kappa x})_j = -\kappa^{*2} e^{i\kappa x_j}$$
thus defining the modified wavenumber for a second-derivative approximation.
10. Consider the following one-sided differencing schemes, which are first-, second-,
and third-order, respectively:
Find the modified wavenumber for each of these schemes. Plot the real and
imaginary parts of κ∗ ∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Derive the two leading
terms in the truncation error for each scheme.
Chapter 4
THE SEMI-DISCRETE
APPROACH
Clearly Eq. 4.2 is a difference equation which can be used at the space point j to
advance the value of u from the previous time levels n and n − 1 to the level n + 1.
It is a full discretization of the PDE. Note, however, that the spatial and temporal
discretizations are separable. Thus, this method has an intermediate semi-discrete
form and can be analyzed by the methods discussed in the next few chapters.
Another possibility is to replace the value of $u_j^{(n)}$ in the right-hand side of Eq. 4.2 with the time average of u at that point, namely $(u_j^{(n+1)} + u_j^{(n-1)})/2$. This results in the formula
$$u_j^{(n+1)} = u_j^{(n-1)} + \frac{2h\nu}{\Delta x^2}\left[u_{j+1}^{(n)} - 2\left(\frac{u_j^{(n+1)} + u_j^{(n-1)}}{2}\right) + u_{j-1}^{(n)}\right] \qquad (4.3)$$
which can be solved for u(n+1) and time advanced at the point j. In this case, the
spatial and temporal discretizations are not separable, and no semi-discrete form
exists.
Equation 4.2 is sometimes called Richardson’s method of overlapping steps and
Eq. 4.3 is referred to as the DuFort-Frankel method. As we shall see later on, there
are subtle points to be made about using these methods to find a numerical solution
to the diffusion equation. There are a number of issues concerning the accuracy,
stability, and convergence of Eqs. 4.2 and 4.3 which we cannot comment on until we
develop a framework for such investigations. We introduce these methods here only to
distinguish between methods in which the temporal and spatial terms are discretized
separately and those for which no such separation is possible. For the time being, we
shall separate the space difference approximations from the time differencing. In this
approach, we reduce the governing PDE’s to ODE’s by discretizing the spatial terms
and use the well-developed theory of ODE solutions to aid us in the development of
an analysis of accuracy and stability.
$$\frac{d\vec{u}}{dt} = \frac{\nu}{\Delta x^2}\, B(1, -2, 1)\,\vec{u} + \vec{(bc)} \qquad (4.4)$$
with Dirichlet boundary conditions folded into the $\vec{(bc)}$ vector.
$$\frac{d\vec{u}}{dt} = -\frac{a}{2\Delta x}\, B_p(-1, 0, 1)\,\vec{u} \qquad (4.5)$$
where the boundary condition vector is absent because the flow is periodic.
Eqs. 4.4 and 4.5 are the model ODE’s for diffusion and biconvection of a scalar in
one dimension. They are linear with coefficient matrices which are independent of x
and t.
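The spectra of these two model operators can be checked directly; this Python sketch (with arbitrary M, ∆x, ν, and a; the boundary vector of Eq. 4.4 is omitted since it does not affect the eigenvalues) assembles B(1, −2, 1) and B_p(−1, 0, 1) and confirms that the diffusion eigenvalues are real and negative while the biconvection eigenvalues are pure imaginary:

```python
import numpy as np

# Semi-discrete model operators: diffusion (Eq. 4.4, Dirichlet) and
# periodic convection (Eq. 4.5).
M, dx, nu, a = 8, 0.1, 1.0, 1.0
B = (np.diag(-2.0 * np.ones(M)) + np.diag(np.ones(M - 1), 1)
     + np.diag(np.ones(M - 1), -1))
row = np.zeros(M)
row[1], row[-1] = 1.0, -1.0                  # superdiagonal +1, subdiagonal -1
Bp = np.stack([np.roll(row, j) for j in range(M)])

lam_diff = np.linalg.eigvals(nu / dx**2 * B)
lam_conv = np.linalg.eigvals(-a / (2.0 * dx) * Bp)
print(lam_diff.real.max() < 0)               # True: real and negative
print(np.allclose(lam_conv.real, 0.0))       # True: pure imaginary
```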
$$\frac{d\vec{u}}{dt} = A\vec{u} - \vec{f}(t) \qquad (4.6)$$
Note that the elements in the matrix A depend upon both the PDE and the type of
differencing scheme chosen for the space terms. The vector f (t) is usually determined
by the boundary conditions and possibly source terms. In general, even the Euler
and Navier-Stokes equations can be expressed in the form of Eq. 4.6. In such cases
the equations are nonlinear, that is, the elements of A depend on the solution u and
are usually derived by finding the Jacobian of a flux vector. Although the equations
are nonlinear, the linear analysis presented in this book leads to diagnostics that are
surprisingly accurate when used to evaluate many aspects of numerical methods as
they apply to the Euler and Navier-Stokes equations.
$$A\vec{x}_m = \lambda_m \vec{x}_m \qquad \text{or} \qquad [A - \lambda_m I]\,\vec{x}_m = 0 \qquad (4.7)$$
The eigenvalues $\lambda_m$ are the roots of the characteristic equation
$$\det[A - \lambda I] = 0$$
We form the right-hand eigenvector matrix of a complete system by filling its columns with the eigenvectors $\vec{x}_m$:
$$X = \left[\, \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_M \,\right]$$
The inverse is the left-hand eigenvector matrix, and together they have the property that
$$X^{-1} A X = \Lambda \qquad (4.8)$$
Defective Systems
If an M ×M matrix does not have a complete set of linearly independent eigenvectors,
it cannot be transformed to a diagonal matrix of scalars, and it is said to be defective.
It can, however, be transformed to a diagonal set of blocks, some of which may be
scalars (see Appendix A). In general, there exists some S which transforms any
matrix A such that
$$S^{-1} A S = J$$
where
$$J = \begin{bmatrix} J_1 & & & & \\ & J_2 & & & \\ & & \ddots & & \\ & & & J_m & \\ & & & & \ddots \end{bmatrix}$$
and a typical Jordan submatrix of order n has the form
$$J_m^{(n)} = \begin{bmatrix} \lambda_m & 1 & & & \\ & \lambda_m & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda_m & 1 \\ & & & & \lambda_m \end{bmatrix}$$
The matrix J is said to be in Jordan canonical form, and an eigenvalue with multi-
plicity n within a Jordan block is said to be a defective eigenvalue. Defective systems
play a role in numerical stability analysis.
$$\frac{du}{dt} = \lambda u + a e^{\mu t} \qquad (4.9)$$
where λ, a, and µ are scalars, all of which can be complex numbers. The equation
is linear because λ does not depend on u, and has a general solution because λ does
not depend on t. It has a steady-state solution if the right-hand side is independent
of t, i.e., if µ = 0, and is homogeneous if the forcing function is zero, i.e., if a = 0.
Although it is quite simple, the numerical analysis of Eq. 4.9 displays many of the
fundamental properties and issues involved in the construction and study of most
popular time-marching methods. This theme will be developed as we proceed.
For $\mu \neq \lambda$, the general solution of Eq. 4.9 is
$$u(t) = c_1 e^{\lambda t} + \frac{a e^{\mu t}}{\mu - \lambda}$$
where $c_1$ is a constant determined by the initial condition. In terms of $u(0)$, the solution is
$$u(t) = u(0)\, e^{\lambda t} + a\,\frac{e^{\mu t} - e^{\lambda t}}{\mu - \lambda}$$
The interesting question can arise: What happens to the solution of Eq. 4.9 when
µ = λ? This is easily found by setting µ = λ + ε, solving, and then taking the limit as ε → 0. Using this limiting device, we find that the solution to
$$\frac{du}{dt} = \lambda u + a e^{\lambda t} \qquad (4.10)$$
is given by
$$u(t) = \left[u(0) + at\right] e^{\lambda t}$$
As we shall soon see, this solution is required for the analysis of defective systems.
Second-Order Equations
The homogeneous form of a second-order equation is given by
$$\frac{d^2 u}{dt^2} + a_1 \frac{du}{dt} + a_0 u = 0 \qquad (4.11)$$
where a1 and a0 are complex constants. Now we introduce the differential operator
D such that
$$D \equiv \frac{d}{dt}$$
and factor u(t) out of Eq. 4.11, giving
$$(D^2 + a_1 D + a_0)\, u(t) = 0$$
The polynomial $P(D) = D^2 + a_1 D + a_0$ is the characteristic polynomial, and its roots are $\lambda_1, \cdots, \lambda_M$. They are found by solving the equation P(λ) = 0. In our simple example, there would be two roots, $\lambda_1$ and $\lambda_2$, determined from
$$P(\lambda) = \lambda^2 + a_1 \lambda + a_0 = 0 \qquad (4.12)$$
and the solution to Eq. 4.11 is given by
$$u(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} \qquad (4.13)$$
where c1 and c2 are constants determined from initial conditions. The proof of this
is simple and is found by substituting Eq. 4.13 into Eq. 4.11. One finds the result
$$c_1 e^{\lambda_1 t}\left(\lambda_1^2 + a_1 \lambda_1 + a_0\right) + c_2 e^{\lambda_2 t}\left(\lambda_2^2 + a_1 \lambda_2 + a_0\right)$$
which is identically zero for all c1 , c2 , and t if and only if the λ’s satisfy Eq. 4.12.
A Derogatory System
Eq. 4.15 is still a solution to Eq. 4.14 if λ1 = λ2 = λ, provided two linearly independent
vectors exist to satisfy Eq. 4.16 with A = Λ. In this case
$$\begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \lambda \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \lambda \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
provide such a solution. This is the case where A has a complete set of eigenvectors
and is not defective.
A Defective System
If A is defective, then it can be represented by the Jordan canonical form
$$\frac{d}{dt}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} \lambda & 0 \\ 1 & \lambda \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \qquad (4.18)$$
whose solution is not obvious. However, in this case, one can solve the top equation
first, giving u1 (t) = u1 (0)eλt . Then, substituting this result into the second equation,
one finds
$$\frac{du_2}{dt} = \lambda u_2 + u_1(0)\, e^{\lambda t}$$
which is identical in form to Eq. 4.10 and has the solution
$$u_2(t) = \left[u_2(0) + u_1(0)\, t\right] e^{\lambda t}$$
Continuing in this way, one can verify that
$$u_3(t) = \left[a + bt + ct^2\right] e^{\lambda t}$$
is a solution to
$$\frac{d}{dt}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} \lambda & & \\ 1 & \lambda & \\ & 1 & \lambda \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$$
if
$$a = u_3(0)\,, \qquad b = u_2(0)\,, \qquad c = \frac{1}{2}\, u_1(0) \qquad (4.19)$$
The general solution to such defective systems is left as an exercise.
$$\frac{d\vec{u}}{dt} = A\vec{u} - \vec{f}(t) \qquad (4.20)$$
Our assumption is that the M × M matrix A has a complete eigensystem¹ and can
be transformed by the left and right eigenvector matrices, X−1 and X, to a diagonal
matrix Λ having diagonal elements which are the eigenvalues of A, see Section 4.2.1.
Now let us multiply Eq. 4.20 from the left by X −1 and insert the identity combination
XX −1 = I between A and u. There results
$$X^{-1}\frac{d\vec{u}}{dt} = X^{-1} A X \cdot X^{-1}\vec{u} - X^{-1}\vec{f}(t) \qquad (4.21)$$
Since A is independent of both u and t, the elements in X−1 and X are also indepen-
dent of both u and t, and Eq. 4.21 can be modified to
$$\frac{d}{dt}\left(X^{-1}\vec{u}\right) = \Lambda\, X^{-1}\vec{u} - X^{-1}\vec{f}(t)$$
Finally, by introducing the new variables $\vec{w}$ and $\vec{g}$ such that
$$\vec{w} = X^{-1}\vec{u}\,, \qquad \vec{g}(t) = X^{-1}\vec{f}(t) \qquad (4.22)$$
we reduce Eq. 4.21 to
$$\frac{d\vec{w}}{dt} = \Lambda \vec{w} - \vec{g}(t) \qquad (4.23)$$
It is important at this point to review the results of the previous paragraph. Notice
that Eqs. 4.20 and 4.23 are expressing exactly the same equality. The only difference
between them was brought about by algebraic manipulations which regrouped the
variables. However, this regrouping is crucial for the solution process because Eqs.
¹In the following, we exclude defective systems, not because they cannot be analyzed (the example at the conclusion of the previous section proves otherwise), but because they are only of limited interest in the general development of our theory.
4.23 are no longer coupled. They can be written line by line as a set of independent,
single, first-order equations, thus
$$\begin{aligned}
w_1' &= \lambda_1 w_1 - g_1(t) \\
&\;\;\vdots \\
w_m' &= \lambda_m w_m - g_m(t) \\
&\;\;\vdots \\
w_M' &= \lambda_M w_M - g_M(t)
\end{aligned} \qquad (4.24)$$
For any given set of gm (t) each of these equations can be solved separately and then
recoupled, using the inverse of the relations given in Eqs. 4.22:
$$\vec{u}(t) = X\vec{w}(t) = \sum_{m=1}^{M} w_m(t)\, \vec{x}_m \qquad (4.25)$$
$$\vec{u}(t) = \underbrace{\sum_{m=1}^{M} c_m e^{\lambda_m t}\, \vec{x}_m}_{\text{Transient}} \;+\; \underbrace{X \Lambda^{-1} X^{-1} \vec{f}}_{\text{Steady-state}} \qquad (4.26)$$
$$\frac{d\vec{u}}{dt} = A\vec{u} - \vec{f} \qquad (4.28)$$
The dependent variable u represents some physical quantity or quantities which relate
to the problem of interest. For the model problems on which we are focusing most
of our attention, the elements of A are independent of both u and t. This permits
us to say a great deal about such problems and serves as the basis for this section.
In particular, we can develop some very important and fundamental concepts that
underly the global properties of the numerical solutions to the model problems. How
these relate to the numerical solutions of more practical problems depends upon the
problem and, to a much greater extent, on the cleverness of the relator.
We begin by developing the concept of “spaces”. That is, we identify different
mathematical reference frames (spaces) and view our solutions from within each.
In this way, we get different perspectives of the same solution, and this can add
significantly to our understanding.
The most natural reference frame is the physical one. We say:

If a solution is expressed in terms of $\vec{u}$, it is said to be in real space.

An alternative frame is provided by the eigensystem of A, in which the transformed system matrix is diagonal and composed, in the simplest nondefective case, of scalars. Following Section 4.2 and, for simplicity, using only complete systems for our examples, we found that Eq. 4.28 had the alternative form
$$\frac{d\vec{w}}{dt} = \Lambda \vec{w} - \vec{g}$$
which is an uncoupled set of first-order ODE’s that can be solved independently for
the dependent variable vector w. We say
If a solution is expressed in terms of $\vec{w}$, it is said to be in eigenspace (often referred to as wave space).
The relations that transfer from one space to the other are:
$$\vec{w} = X^{-1}\vec{u}\,, \qquad \vec{u} = X\vec{w}$$
$$\vec{g} = X^{-1}\vec{f}\,, \qquad \vec{f} = X\vec{g}$$
The elements of $\vec{u}$ relate directly to the local physics of the problem. However, the elements of $\vec{w}$ are linear combinations of all of the elements of $\vec{u}$, and individually they have no direct local physical interpretation.
When the forcing function f is independent of t, the solutions in the two spaces
are represented by
$$\vec{u}(t) = \sum_m c_m e^{\lambda_m t}\, \vec{x}_m + X \Lambda^{-1} X^{-1} \vec{f}$$
and
$$w_m(t) = c_m e^{\lambda_m t} + \frac{1}{\lambda_m}\, g_m\,; \qquad m = 1, 2, \cdots, M$$
for real space and eigenspace, respectively. At this point we make two observations:
1. the transient portion of the solution in real space consists of a linear combination
of contributions from each eigenvector, and
$$\lambda_m = -i\kappa_m^*\, a\,, \qquad m = 0, 1, \cdots, M-1 \qquad (4.30)$$
where
$$\kappa_m^* = \frac{\sin \kappa_m \Delta x}{\Delta x}\,, \qquad m = 0, 1, \cdots, M-1 \qquad (4.31)$$
is the modified wavenumber from Section 3.5, κm = m, and ∆x = 2π/M . Notice
that the diffusion eigenvalues are real and negative while those representing periodic
convection are all pure imaginary. The interpretation of this result plays a very
important role later in our stability analysis.
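A numerical check of Eqs. 4.30 and 4.31 (a Python sketch; M = 8 and the circulant construction mirror the illustrative choices used earlier): the eigenvalues of the biconvection matrix match −i a sin(κₘ∆x)/∆x.

```python
import numpy as np

# Eigenvalues of the periodic convection operator vs. the analytic
# values lambda_m = -i a sin(kappa_m dx)/dx with kappa_m = m, dx = 2 pi/M.
M, a = 8, 1.0
dx = 2.0 * np.pi / M
row = np.zeros(M)
row[1], row[-1] = 1.0, -1.0
A = -a / (2.0 * dx) * np.stack([np.roll(row, j) for j in range(M)])

computed = np.sort(np.linalg.eigvals(A).imag)        # real parts are ~0
analytic = np.sort(-a * np.sin(np.arange(M) * dx) / dx)
print(np.allclose(computed, analytic))               # True
```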
The rows of the matrix are proportional to the eigenvectors. In general, $\vec{w} = X^{-1}\vec{u}$ gives
$$w_m = \sum_{j=1}^{M} u_j \sin(m x_j)\,; \qquad m = 1, 2, \cdots, M \qquad (4.33)$$
In the field of harmonic analysis, Eq. 4.33 represents a sine transform of the func-
tion u(x) for an M -point sample between the boundaries x = 0 and x = π with the
condition u(0) = u(π) = 0. Similarly, Eq. 4.32 represents the sine synthesis that
companions the sine transform given by Eq. 4.33. In summary,
$\vec{w} = X^{-1}\vec{u}$ is a sine transform from real space to (sine) wave space.

$\vec{u} = X\vec{w}$ is a sine synthesis from wave space back to real space.
$$(x_m)_j = e^{ij(2\pi m / M)}\,, \qquad j = 0, 1, \cdots, M-1\,; \quad m = 0, 1, \cdots, M-1$$
$$u_j = \sum_{m=0}^{M-1} w_m\, e^{i m x_j}\,; \qquad j = 0, 1, \cdots, M-1 \qquad (4.34)$$
For a 4-point periodic mesh, we find the following left-hand eigenvector matrix
from Appendix B.4:
$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & e^{-2i\pi/4} & e^{-4i\pi/4} & e^{-6i\pi/4} \\ 1 & e^{-4i\pi/4} & e^{-8i\pi/4} & e^{-12i\pi/4} \\ 1 & e^{-6i\pi/4} & e^{-12i\pi/4} & e^{-18i\pi/4} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} = X^{-1}\vec{u}$$
In general
$$w_m = \frac{1}{M}\sum_{j=0}^{M-1} u_j\, e^{-i m x_j}\,; \qquad m = 0, 1, \cdots, M-1$$
This equation is identical to a discrete Fourier transform of the periodic dependent
variable u using an M -point sample between and including x = 0 and x = 2π − ∆x.
For circulant matrices, it is straightforward to establish the fact that the relation
u = X w represents the Fourier synthesis of the variable w back to u. In summary,
$\vec{w} = X^{-1}\vec{u}$ is a complex Fourier transform from real space to wave space.

$\vec{u} = X\vec{w}$ is a complex Fourier synthesis from wave space back to real space.
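Because X⁻¹ here is, apart from the 1/M factor, the discrete Fourier transform matrix, standard FFT routines implement both directions; a brief Python sketch (the test function is an arbitrary choice):

```python
import numpy as np

# w = X^{-1} u is a discrete Fourier transform (with a 1/M normalization);
# u = X w is the corresponding synthesis.
M = 8
x = 2.0 * np.pi * np.arange(M) / M
u = np.sin(2.0 * x) + 0.3 * np.cos(3.0 * x)

w = np.fft.fft(u) / M          # real space -> wave space
u_back = np.fft.ifft(w) * M    # wave space -> real space
print(np.allclose(u_back.real, u))  # True
```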
$$u_j(t) = \sum_{m=1}^{M} c_m e^{\lambda_m t} \sin(m x_j) + (A^{-1}\vec{f})_j\,, \qquad j = 1, 2, \cdots, M \qquad (4.35)$$
where
$$\lambda_m = \frac{-4\nu}{\Delta x^2} \sin^2\!\left(\frac{m\pi}{2(M+1)}\right) \qquad (4.36)$$
$$u_j(t) = \sum_{m=1}^{M} c_m e^{-\nu \kappa_m^{*2} t} \sin(\kappa_m x_j) + (A^{-1}\vec{f})_j\,, \qquad j = 1, 2, \cdots, M \qquad (4.38)$$
This can be compared with the exact solution to the PDE, Eq. 2.37, evaluated at the
nodes of the grid:
$$u_j(t) = \sum_{m=1}^{M} c_m e^{-\nu \kappa_m^2 t} \sin(\kappa_m x_j) + h(x_j)\,, \qquad j = 1, 2, \cdots, M \qquad (4.39)$$
We see that the solutions are identical except for the steady solution and the
modified wavenumber in the transient term. The modified wavenumber is an approx-
imation to the actual wavenumber. The difference between the modified wavenumber
and the actual wavenumber depends on the differencing scheme and the grid resolu-
tion. This difference causes the various modes (or eigenvector components) to decay
at rates which differ from the exact solution. With conventional differencing schemes,
low wavenumber modes are accurately represented, while high wavenumber modes (if
they have significant amplitudes) can have large errors.
$$u_j(t) = \sum_{m=0}^{M-1} c_m e^{\lambda_m t}\, e^{i\kappa_m x_j}\,, \qquad j = 0, 1, \cdots, M-1 \qquad (4.40)$$
where
$$\lambda_m = -i\kappa_m^*\, a \qquad (4.41)$$
with the modified wavenumber defined in Eq. 4.31. We can write this ODE solution
as
$$u_j(t) = \sum_{m=0}^{M-1} c_m e^{-i\kappa_m^* a t}\, e^{i\kappa_m x_j}\,, \qquad j = 0, 1, \cdots, M-1 \qquad (4.42)$$
and compare it to the exact solution of the PDE, Eq. 2.26, evaluated at the nodes of
the grid:
$$u_j(t) = \sum_{m=0}^{M-1} f_m(0)\, e^{-i\kappa_m a t}\, e^{i\kappa_m x_j}\,, \qquad j = 0, 1, \cdots, M-1 \qquad (4.43)$$
Once again the difference appears through the modified wavenumber contained in
λm . As discussed in Section 3.5, this leads to an error in the speed with which various
modes are convected, since κ∗ is real. Since the error in the phase speed depends on
the wavenumber, while the actual phase speed is independent of the wavenumber,
the result is erroneous numerical dispersion. In the case of non-centered differencing,
discussed in Chapter 11, the modified wavenumber is complex. The form of Eq. 4.42
shows that the imaginary portion of the modified wavenumber produces nonphysical
decay or growth in the numerical solution.
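A short sketch (our own, assuming a = 1 and a 40-point mesh) makes the dispersion concrete: the numerical phase speed aκ*/κ of second-order centered differencing lags the exact speed, most severely at high wavenumbers:

```python
import numpy as np

a, M = 1.0, 40
dx = 2 * np.pi / M
kappa = np.arange(1, M // 2)             # resolvable integer wavenumbers
kappa_star = np.sin(kappa * dx) / dx     # modified wavenumber, Eq. 4.31
print(a * kappa_star / kappa)            # ~1 at low kappa, -> 0 as kappa*dx -> pi
```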
4.4 The Representative Equation
The question is: What should we use for g_m(t) when the time dependence cannot be ignored? To answer this question, we note that, in principle, one can express any one of the forcing terms g_m(t) as a finite Fourier series. For example
g(t) = Σ_k a_k e^{ikt}
for which the ODE dw/dt = λw + g(t) has the exact solution
w(t) = ce^{λt} + Σ_k a_k e^{ikt}/(ik − λ)
From this we can extract the k'th term and replace ik with µ. This leads to the representative equation
dw/dt = λw + ae^{µt}   (4.45)
which can be used to evaluate all manner of time-marching methods. In such evaluations the parameters λ and µ must be allowed to take the worst possible combination of values that might occur in the ODE eigensystem. The exact solution of the representative ODE is (for µ ≠ λ):
w(t) = ce^{λt} + ae^{µt}/(µ − λ)   (4.46)
4.5 Problems
1. Consider the finite-difference operator derived in question 1 of Chapter 3. Using
this operator to approximate the spatial derivative in the linear convection equa-
tion, write the semi-discrete form obtained with periodic boundary conditions
on a 5-point grid (M = 5).
2. Consider the application of the operator given in Eq. 3.52 to the 1-D diffusion
equation with Dirichlet boundary conditions. Write the resulting semi-discrete
ODE form. Find the entries in the boundary-condition vector.
3. Write the semi-discrete form resulting from the application of second-order cen-
tered differences to the following equation on the domain 0 ≤ x ≤ 1 with
boundary conditions u(0) = 0, u(1) = 1:
∂u/∂t = ∂²u/∂x² − 6x
4. Consider the matrix A = −aB_p(−1, 0, 1)/(2∆x)
corresponding to the ODE form of the biconvection equation resulting from the
application of second-order central differencing on a 10-point grid. Note that
the domain is 0 ≤ x ≤ 2π and ∆x = 2π/10. The grid nodes are given by
xj = j∆x, j = 0, 1, . . . 9. The eigenvalues of the above matrix A, as well as the
matrices X and X −1 , can be found from Appendix B.4. Using these, compute
and plot the ODE solution at t = 2π for the initial condition u(x, 0) = sin x.
Compare with the exact solution of the PDE. Calculate the numerical phase
speed from the modified wavenumber corresponding to this initial condition
and show that it is consistent with the ODE solution. Repeat for the initial
condition u(x, 0) = sin 2x.
Chapter 5
FINITE-VOLUME METHODS
d/dt ∫_{V(t)} Q dV + ∮_{S(t)} n·F dS = ∫_{V(t)} P dV   (5.1)
We will begin by presenting the basic concepts which apply to finite-volume strategies.
Next we will give our model equations in the form of Eq. 5.1. This will be followed
by several examples which should make these concepts clear.
Q̄ ≡ (1/V) ∫_V Q dV   (5.3)
and Eq. 5.1 can be written as
d/dt (V Q̄) + ∮_S n·F dS = ∫_V P dV   (5.4)
for a control volume which does not vary with time. Thus after applying a time-
marching method, we have updated values of the cell-averaged quantities Q̄. In order
to evaluate the fluxes, which are a function of Q, at the control-volume boundary, Q
can be represented within the cell by some piecewise approximation which produces
the correct value of Q̄. This is a form of interpolation often referred to as recon-
struction. As we shall see in our examples, each cell will have a different piecewise
approximation to Q. When these are used to calculate F(Q), they will generally
produce different approximations to the flux at the boundary between two control
volumes, that is, the flux will be discontinuous. A nondissipative scheme analogous
to centered differencing is obtained by taking the average of these two fluxes. Another
approach known as flux-difference splitting is described in Chapter 11.
The basic elements of a finite-volume method are thus the following:
1. Given the value of Q̄ for each control volume, construct an approximation to Q in each control volume and use it to evaluate the flux F(Q) at the control-volume boundary. Since adjacent control volumes have distinct approximations, two distinct values of the flux will generally be obtained at any point on the boundary between them.
¹Time-marching methods will be discussed in the next chapter.
2. Apply some strategy for resolving the discontinuity in the flux at the control-
volume boundary to produce a single value of F(Q) at any point on the bound-
ary. This issue is discussed in Section 11.4.2.
3. Integrate the flux to find the net flux through the control-volume boundary
using some sort of quadrature.
where the unit vector n points outward from the surface or contour.
∂u/∂t + a cos θ ∂u/∂x + a sin θ ∂u/∂y = 0   (5.7)
This PDE governs a simple plane wave convecting the scalar quantity u(x, y, t) with
speed a along a straight line making an angle θ with respect to the x-axis. The
one-dimensional form is recovered with θ = 0.
For unit speed a, the two-dimensional linear convection equation is obtained from
the general divergence form, Eq. 2.3, with
Q = u (5.8)
F = iu cos θ + ju sin θ (5.9)
P = 0 (5.10)
d/dt ∫_A u dA + ∮_C n·(iu cos θ + ju sin θ) ds = 0   (5.11)
where A is the area of the cell which is bounded by the closed contour C.
Q = u (5.12)
F = −∇u   (5.13)
  = −(i ∂u/∂x + j ∂u/∂y)   (5.14)
P = 0 (5.15)
[Figure 5.1: A one-dimensional control volume of width ∆x with boundaries at j − 1/2 and j + 1/2.]
Now with ξ = x − xj , we can expand u(x) in Eq. 5.19 in a Taylor series about xj
(with t fixed) to get
ū_j ≡ (1/∆x) ∫_{−∆x/2}^{∆x/2} [ u_j + ξ(∂u/∂x)_j + (ξ²/2)(∂²u/∂x²)_j + (ξ³/6)(∂³u/∂x³)_j + · · · ] dξ
    = u_j + (∆x²/24)(∂²u/∂x²)_j + (∆x⁴/1920)(∂⁴u/∂x⁴)_j + O(∆x⁶)   (5.21)
or
ūj = uj + O(∆x2 ) (5.22)
where uj is the value at the center of the cell. Hence the cell-average value and the
value at the center of the cell differ by a term of second order.
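This O(∆x²) difference is easily checked. The sketch below (with u = cos x as an arbitrary smooth test function of our choosing) evaluates the exact cell average and compares the error with the leading term of Eq. 5.21:

```python
import numpy as np

xj = 0.7
for dx in (0.2, 0.1, 0.05):
    ubar = (np.sin(xj + dx / 2) - np.sin(xj - dx / 2)) / dx  # exact average of cos
    err = ubar - np.cos(xj)
    lead = dx**2 / 24 * (-np.cos(xj))    # (dx^2/24)*u''(x_j), from Eq. 5.21
    print(dx, err, lead)                 # err tracks lead and falls 4x per halving
```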
where the L indicates that this approximation to fj+1/2 is obtained from the approx-
imation to u(x) in the cell to the left of xj+1/2 , as shown in Fig. 5.1. The cell to the
right of xj+1/2 , which is cell j + 1, gives
f^R_{j+1/2} = ū_{j+1}   (5.26)
We have now accomplished the first step from the list in Section 5.1; we have defined
the fluxes at the cell boundaries in terms of the cell-average data. In this example,
the discontinuity in the flux at the cell boundary is resolved by taking the average of
the fluxes on either side of the boundary. Thus
f̂_{j+1/2} = (1/2)(f^L_{j+1/2} + f^R_{j+1/2}) = (1/2)(ū_j + ū_{j+1})   (5.29)
and
f̂_{j−1/2} = (1/2)(f^L_{j−1/2} + f^R_{j−1/2}) = (1/2)(ū_{j−1} + ū_j)   (5.30)
where fˆ denotes a numerical flux which is an approximation to the exact flux.
Substituting Eqs. 5.29 and 5.30 into the integral form, Eq. 5.23, we obtain
∆x dū_j/dt + (1/2)(ū_j + ū_{j+1}) − (1/2)(ū_{j−1} + ū_j) = ∆x dū_j/dt + (1/2)(ū_{j+1} − ū_{j−1}) = 0   (5.31)
With periodic boundary conditions, this point operator produces the following semi-
discrete form:
dū/dt = −(1/(2∆x)) B_p(−1, 0, 1)ū   (5.32)
In each cell, u is represented by the piecewise quadratic approximation
u(ξ) = aξ² + bξ + c   (5.33)
where the coefficients are determined by the constraints
(1/∆x) ∫_{−3∆x/2}^{−∆x/2} u(ξ) dξ = ū_{j−1}
(1/∆x) ∫_{−∆x/2}^{∆x/2} u(ξ) dξ = ū_j
(1/∆x) ∫_{∆x/2}^{3∆x/2} u(ξ) dξ = ū_{j+1}   (5.34)
These constraints lead to
a = (ū_{j+1} − 2ū_j + ū_{j−1})/(2∆x²)
b = (ū_{j+1} − ū_{j−1})/(2∆x)
c = ū_j − (ū_{j+1} − 2ū_j + ū_{j−1})/24   (5.35)
u^L_{j+1/2} = (1/6)(2ū_{j+1} + 5ū_j − ū_{j−1})   (5.36)
u^R_{j−1/2} = (1/6)(−ū_{j+1} + 5ū_j + 2ū_{j−1})   (5.37)
u^R_{j+1/2} = (1/6)(−ū_{j+2} + 5ū_{j+1} + 2ū_j)   (5.38)
u^L_{j−1/2} = (1/6)(2ū_j + 5ū_{j−1} − ū_{j−2})   (5.39)
using the notation defined in Section 5.3.1. Recalling that f = u, we again use the
average of the fluxes on either side of the boundary to obtain
f̂_{j+1/2} = (1/2)[f(u^L_{j+1/2}) + f(u^R_{j+1/2})]
          = (1/12)(−ū_{j+2} + 7ū_{j+1} + 7ū_j − ū_{j−1})   (5.40)
and
f̂_{j−1/2} = (1/2)[f(u^L_{j−1/2}) + f(u^R_{j−1/2})]
          = (1/12)(−ū_{j+1} + 7ū_j + 7ū_{j−1} − ū_{j−2})   (5.41)
Substituting these expressions into the integral form, Eq. 5.23, gives
∆x dū_j/dt + (1/12)(−ū_{j+2} + 8ū_{j+1} − 8ū_{j−1} + ū_{j−2}) = 0   (5.42)
This is a fourth-order approximation to the integral form of the equation, as can be
verified using Taylor series expansions (see question 1 at the end of this chapter).
With periodic boundary conditions, the following semi-discrete form is obtained:
dū/dt = −(1/(12∆x)) B_p(1, −8, 0, 8, −1)ū   (5.43)
This is a system of ODE’s governing the evolution of the cell-average data.
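The fourth-order claim can be verified numerically as well as by Taylor series. In the sketch below (our own check, using exact cell averages of u = sin x on a periodic grid), the error in dū_j/dt from Eq. 5.42 drops by roughly a factor of 16 per mesh halving:

```python
import numpy as np

for M in (20, 40, 80):
    dx = 2 * np.pi / M
    xj = np.arange(M) * dx
    ubar = (np.cos(xj - dx / 2) - np.cos(xj + dx / 2)) / dx   # exact averages
    dudt = -(-np.roll(ubar, -2) + 8 * np.roll(ubar, -1)
             - 8 * np.roll(ubar, 1) + np.roll(ubar, 2)) / (12 * dx)
    exact = -(np.sin(xj + dx / 2) - np.sin(xj - dx / 2)) / dx  # exact integral form
    print(M, np.max(np.abs(dudt - exact)))
```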
In one dimension, the integral form of the diffusion equation, Eq. 5.16, becomes
∆x dū_j/dt + f_{j+1/2} − f_{j−1/2} = 0   (5.44)
with f = −∇u = −∂u/∂x. Also, Eq. 5.6 becomes
∫_a^b (∂u/∂x) dx = u(b) − u(a)   (5.45)
We can thus write the following expression for the average value of the gradient of u
over the interval xj ≤ x ≤ xj+1 :
(1/∆x) ∫_{x_j}^{x_{j+1}} (∂u/∂x) dx = (1/∆x)(u_{j+1} − u_j)   (5.46)
From Eq. 5.22, we know that the value of a continuous function at the center of a given
interval is equal to the average value of the function over the interval to second-order
accuracy. Hence, to second-order, we can write
f̂_{j+1/2} = −(∂u/∂x)_{j+1/2} = −(1/∆x)(ū_{j+1} − ū_j)   (5.47)
Similarly,
f̂_{j−1/2} = −(1/∆x)(ū_j − ū_{j−1})   (5.48)
Substituting these into the integral form, Eq. 5.44, we obtain
∆x dū_j/dt = (1/∆x)(ū_{j−1} − 2ū_j + ū_{j+1})   (5.49)
or, with Dirichlet boundary conditions,
dū/dt = (1/∆x²)[B(1, −2, 1)ū + (bc)]   (5.50)
This provides a semi-discrete finite-volume approximation to the diffusion equation,
and we see that the properties of the matrix B(1, −2, 1) are relevant to the study of
finite-volume methods as well as finite-difference methods.
For our second approach, we use a piecewise quadratic approximation as in Section
5.3.2. From Eq. 5.33 we have
∂u/∂x = ∂u/∂ξ = 2aξ + b   (5.51)
f^R_{j−1/2} = f^L_{j−1/2} = −(1/∆x)(ū_j − ū_{j−1})   (5.53)
Notice that there is no discontinuity in the flux at the cell boundary. This produces
dū_j/dt = (1/∆x²)(ū_{j−1} − 2ū_j + ū_{j+1})   (5.54)
which is identical to Eq. 5.49. The resulting semi-discrete form with periodic bound-
ary conditions is
dū/dt = (1/∆x²) B_p(1, −2, 1)ū   (5.55)
which is written entirely in terms of cell-average data.
[Figure 5.2: A hexagonal control volume centered at node p, with neighboring nodes a through f, sides indexed ν = 0, · · · , 5, side length l, and mesh scale ∆.]
where we have ignored the source term. The contour in the line integral is composed
of the sides of the hexagon. Since these sides are all straight, the unit normals can
be taken outside the integral and the flux balance is given by
d/dt ∫_A Q dA + Σ_{ν=0}^{5} n_ν · ∫_ν F dl = 0
where ν indexes a side of the hexagon, as shown in Figure 5.2. A list of the normals
for the mesh orientation shown is given in Table 5.1.
For Eq. 5.11, the two-dimensional linear convection equation, we have for side ν
n_ν · ∫_ν F dl = n_ν · (i cos θ + j sin θ) ∫_{−l/2}^{l/2} u_ν(ξ) dξ   (5.58)
where ξ is a length measured from the middle of side ν, and l is the length of the side. Making the change of variable z = ξ/l, one has the expression
∫_{−l/2}^{l/2} u(ξ) dξ = l ∫_{−1/2}^{1/2} u(z) dz   (5.59)
The values of nν · (i cos θ + j sin θ) are given by the expressions in Table 5.2. There
are no numerical approximations in Eq. 5.60. That is, if the integrals in the equation
are evaluated exactly, the integrated time rate of change of the integral of u over the
area of the hexagon is known exactly.
and the piecewise-constant approximation u = ūp over the entire hexagon, the ap-
proximation to the flux integral becomes trivial. Taking the average of the flux on
either side of each edge of the hexagon gives for edge 1:
∫_{−1/2}^{1/2} u(z) dz = ∫_{−1/2}^{1/2} ((ū_p + ū_a)/2) dz = (ū_p + ū_a)/2   (5.62)
or
dū_p/dt + (1/(3∆))[(2 cos θ)(ū_a − ū_d) + (cos θ + √3 sin θ)(ū_b − ū_e) + (−cos θ + √3 sin θ)(ū_c − ū_f)] = 0   (5.69)
The reader can verify, using Taylor series expansions, that this is a second-order
approximation to the integral form of the two-dimensional linear convection equation.
5.5 Problems
1. Use Taylor series to verify that Eq. 5.42 is a fourth-order approximation to Eq.
5.23.
2. Find the semi-discrete ODE form governing the cell-average data resulting from
the use of a linear approximation in developing a finite-volume method for the
linear convection equation. Use the following linear approximation:
u(ξ) = aξ + b
3. Using the first approach given in Section 5.3.3, derive a finite-volume approx-
imation to the spatial terms in the two-dimensional diffusion equation on a
square grid.
Chapter 6
TIME-MARCHING METHODS FOR ODE'S
After discretizing the spatial derivatives in the governing PDE’s (such as the Navier-
Stokes equations), we obtain a coupled system of nonlinear ODE’s in the form
du/dt = F(u, t)   (6.1)
These can be integrated in time using a time-marching method to obtain a time-
accurate solution to an unsteady flow problem. For a steady flow problem, spatial
discretization leads to a coupled system of nonlinear algebraic equations in the form
F (u) = 0 (6.2)
As a result of the nonlinearity of these equations, some sort of iterative method is
required to obtain a solution. For example, one can consider the use of Newton’s
method, which is widely used for nonlinear algebraic equations (See Section 6.10.3.).
This produces an iterative method in which a coupled system of linear algebraic
equations must be solved at each iteration. These can be solved iteratively using
relaxation methods, which will be discussed in Chapter 9, or directly using Gaussian
elimination or some variation thereof.
Alternatively, one can consider a time-dependent path to the steady state and use
a time-marching method to integrate the unsteady form of the equations until the
solution is sufficiently close to the steady solution. The subject of the present chapter,
time-marching methods for ODE’s, is thus relevant to both steady and unsteady flow
problems. When using a time-marching method to compute steady flows, the goal is
simply to remove the transient portion of the solution as quickly as possible; time-
accuracy is not required. This motivates the study of stability and stiffness, topics
which are covered in the next two chapters.
6.1 Notation
Using the semi-discrete approach, we reduce our PDE to a set of coupled ODE’s
represented in general by Eq. 4.1. However, for the purpose of this chapter, we need
only consider the scalar case
du/dt = u′ = F(u, t)   (6.3)
Although we use u to represent the dependent variable, rather than w, the reader
should recall the arguments made in Chapter 4 to justify the study of a scalar ODE.
Our first task is to find numerical approximations that can be used to carry out the
time integration of Eq. 6.3 to some given accuracy, where accuracy can be measured
either in a local or a global sense. We then face a further task concerning the numerical
stability of the resulting methods, but we postpone such considerations to the next
chapter.
In Chapter 2, we introduced the convention that the n subscript, or the (n) su-
perscript, always points to a discrete time value, and h represents the time interval
∆t. Combining this notation with Eq. 6.3 gives
u′_n = F_n = F(u_n, t_n) ;   t_n = nh
Often we need a more sophisticated notation for intermediate time steps involving temporary calculations denoted by ũ, ū, etc. For these we use the notation ũ′_{n+α} = F(ũ_{n+α}, t_n + αh). As examples of time-marching methods, consider
u_{n+1} = u_n + hu′_n   (6.5)
u_{n+1} = u_n + hu′_{n+1}   (6.6)
and
ũ_{n+1} = u_n + hu′_n
u_{n+1} = (1/2)[u_n + ũ_{n+1} + hũ′_{n+1}]   (6.7)
According to the conditions presented under Eq. 6.4, the first and third of these are
examples of explicit methods. We refer to them as the explicit Euler method and the
MacCormack predictor-corrector method,1 respectively. The second is implicit and
referred to as the implicit (or backward) Euler method.
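A minimal sketch of all three recipes, applied to the scalar ODE u′ = λu (our own test setup, with a = 0 so the exact solution is e^{λt}):

```python
import numpy as np

lam, h, N = -1.0, 0.1, 10

def f(u):
    return lam * u

u_ee = u_ie = u_mc = 1.0
for n in range(N):
    u_ee = u_ee + h * f(u_ee)              # explicit Euler, Eq. 6.5
    u_ie = u_ie / (1 - lam * h)            # implicit Euler, Eq. 6.6, solved for u_{n+1}
    ut = u_mc + h * f(u_mc)                # MacCormack predictor
    u_mc = 0.5 * (u_mc + ut + h * f(ut))   # MacCormack corrector, Eq. 6.7
print(u_ee, u_ie, u_mc, np.exp(lam * h * N))
```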
These methods are simple recipes for the time advance of a function in terms of its
value and the value of its derivative, at given time intervals. The material presented
in Chapter 4 develops a basis for evaluating such methods by introducing the concept
of the representative equation
du/dt = u′ = λu + ae^{µt}   (6.8)
written here in terms of the dependent variable, u. The value of this equation arises
from the fact that, by applying a time-marching method, we can analytically convert
¹Here we give only MacCormack's time-marching method. The method commonly referred to as MacCormack's method, which is a fully-discrete method, will be presented in Section 11.3.
such a linear ODE into a linear O∆E . The latter are subject to a whole body of
analysis that is similar in many respects to, and just as powerful as, the theory of
ODE’s. We next consider examples of this conversion process and then go into the
general theory on solving O∆E’s.
Apply the simple explicit Euler scheme, Eq. 6.5, to Eq. 6.8. There results
un+1 = un + h(λun + aeµhn )
or
un+1 − (1 + λh)un = haeµhn (6.9)
Eq. 6.9 is a linear O∆E , with constant coefficients, expressed in terms of the depen-
dent variable un and the independent variable n. As another example, applying the
implicit Euler method, Eq. 6.6, to Eq. 6.8, we find
u_{n+1} = u_n + h(λu_{n+1} + ae^{µh(n+1)})
or
(1 − λh)un+1 − un = heµh · aeµhn (6.10)
As a final example, the predictor-corrector sequence, Eq. 6.7, gives
ũ_{n+1} − (1 + λh)u_n = ahe^{µhn}
−(1/2)(1 + λh)ũ_{n+1} + u_{n+1} − (1/2)u_n = (1/2)ahe^{µh(n+1)}   (6.11)
which is a coupled set of linear O∆E’s with constant coefficients. Note that the first
line in Eq. 6.11 is identical to Eq. 6.9, since the predictor step in Eq. 6.7 is simply
the explicit Euler method. The second line in Eq. 6.11 is obtained by noting that
ũ′_{n+1} = F(ũ_{n+1}, t_n + h) = λũ_{n+1} + ae^{µh(n+1)}   (6.12)
Now we need to develop techniques for analyzing these difference equations so that
we can compare the merits of the time-marching methods that generated them.
Second-Order Equations
The homogeneous form of a second-order difference equation is given by
u_{n+2} + a₁u_{n+1} + a₀u_n = 0   (6.14)
Instead of the differential operator D ≡ d/dt used for ODE's, we use for O∆E's the difference operator E (commonly referred to as the displacement or shift operator), defined formally by the relations
u_{n+1} = Eu_n ,   u_{n+k} = E^k u_n
Further notice that the displacement operator also applies to exponents; thus
b^α · b^n = b^{n+α} = E^α · b^n
(E² + a₁E + a₀)u_n = 0   (6.15)
which must hold for all u_n. Eq. 6.15 is known as the operational form of Eq. 6.14.
The operational form contains a characteristic polynomial P (E) which plays the same
role for difference equations that P (D) played for differential equations; that is, its
roots determine the solution to the O∆E. In the analysis of O∆E’s, we label these
roots σ1 , σ2 , · · ·, etc, and refer to them as the σ-roots. They are found by solving the
equation P (σ) = 0. In the simple example given above, there are just two σ roots
and in terms of them the solution can be written
where c1 and c2 depend upon the initial conditions. The fact that Eq. 6.16 is a
solution to Eq. 6.14 for all c1 , c2 and n should be verified by substitution.
Obviously the σ_k are the eigenvalues of C and, following the logic of Section 4.2, if x_k are its eigenvectors, the solution of Eq. 6.17 is
u_n = Σ_{k=1}^{2} c_k(σ_k)ⁿ x_k
A Defective System
The solution of O∆E’s with defective eigensystems follows closely the logic in Section
4.2.2 for defective ODE’s. For example, one can show that the solution to
[ ū_{n+1} ]   [ σ       ] [ ū_n ]
[ û_{n+1} ] = [ 1  σ    ] [ û_n ]
[ u_{n+1} ]   [    1  σ ] [ u_n ]
is
ū_n = ū₀σⁿ
û_n = [û₀ + ū₀nσ⁻¹]σⁿ
u_n = [u₀ + û₀nσ⁻¹ + ū₀ (n(n − 1)/2) σ⁻²]σⁿ   (6.18)
[ E                 −(1 + λh) ] [ ũ ]        [ 1      ]
[ −(1/2)(1 + λh)E   E − 1/2   ] [ u ]_n = h · [ (1/2)E ] · ae^{µhn}   (6.21)
All three of these equations are subsets of the operational form of the representative O∆E
P(E)u_n = Q(E) · ae^{µhn}   (6.22)
the solution of which is
u_n = Σ_{k=1}^{K} c_k(σ_k)ⁿ + ae^{µhn} · Q(e^{µh})/P(e^{µh})   (6.23)
where σ_k are the K roots of the characteristic polynomial, P(σ) = 0. When determinants are involved in the construction of P(E) and Q(E), as would be the case for Eq. 6.21, the ratio Q(E)/P(E) can be found by Cramer's rule. Keep in mind that for methods such as in Eq. 6.21 there are multiple (two in this case) solutions, one for u_n and one for ũ_n, and we are usually only interested in the final solution u_n. Notice also the important subset of this solution which occurs when µ = 0, representing a time-invariant particular solution, or a steady state. In such a case
u_n = Σ_{k=1}^{K} c_k(σ_k)ⁿ + a · Q(1)/P(1)
For the explicit Euler method, Eq. 6.9 gives
P(E) = E − 1 − λh
Q(E) = h   (6.24)
and the solution of its representative O∆E follows immediately from Eq. 6.23:
u_n = c₁(1 + λh)ⁿ + ae^{µhn} · h/(e^{µh} − 1 − λh)
For the implicit Euler method, Eq. 6.20, we have
P (E) = (1 − λh)E − 1
Q(E) = hE (6.25)
so
u_n = c₁ [1/(1 − λh)]ⁿ + ae^{µhn} · he^{µh}/[(1 − λh)e^{µh} − 1]
In the case of the coupled predictor-corrector equations, Eq. 6.21, one solves for the
final family un (one can also find a solution for the intermediate family ũ), and there
results
P(E) = det [ E                 −(1 + λh) ] = E[E − 1 − λh − (1/2)λ²h²]
           [ −(1/2)(1 + λh)E   E − 1/2   ]

Q(E) = det [ E                 h       ] = (1/2)hE[E + 1 + λh]
           [ −(1/2)(1 + λh)E   (1/2)hE ]

The σ-root is found from
P(σ) = σ[σ − 1 − λh − (1/2)λ²h²] = 0
which has only one nontrivial root (σ = 0 is simply a shift in the reference index).
The complete solution can therefore be written
u_n = c₁ [1 + λh + (1/2)λ²h²]ⁿ + ae^{µhn} · (1/2)h(e^{µh} + 1 + λh) / [e^{µh} − 1 − λh − (1/2)λ²h²]   (6.26)
where for the present we are not interested in the form of the particular solution (P.S.). Now the explicit Euler method produces, for each λ-root, one σ-root, which is given by σ = 1 + λh. So if we use the Euler method for the time advance of the ODE's, the exact ODE solution
u(t) = c₁(e^{λ₁h})ⁿ x₁ + · · · + c_m(e^{λ_m h})ⁿ x_m + · · · + c_M(e^{λ_M h})ⁿ x_M + P.S.   (6.27)
is approximated by the solution² of the resulting O∆E,
u_n = c₁(σ₁)ⁿ x₁ + · · · + c_m(σ_m)ⁿ x_m + · · · + c_M(σ_M)ⁿ x_M + P.S.   (6.28)
where the c_m and the x_m in the two equations are identical and σ_m = (1 + λ_m h).
Comparing Eq. 6.27 and Eq. 6.28, we see a correspondence between σm and eλm h .
Since the value of eλh can be expressed in terms of the series
e^{λh} = 1 + λh + (1/2)λ²h² + (1/6)λ³h³ + · · · + (1/n!)λⁿhⁿ + · · ·
Applying the leapfrog method, Eq. 6.29 (u_{n+1} = u_{n−1} + 2hu′_n), to Eq. 6.8, we have the characteristic polynomial P(E) = E² − 2λhE − 1, so that for every λ the σ must satisfy the relation
σ_m² − 2λ_m hσ_m − 1 = 0   (6.30)
Now we notice that each λ produces two σ-roots. For one of these we find
σ_m = λ_m h + √(1 + λ_m²h²)   (6.31)
    = 1 + λ_m h + (1/2)λ_m²h² − (1/8)λ_m⁴h⁴ + · · ·   (6.32)
This is an approximation to e^{λ_m h} with an error O(λ³h³). The other root, λ_m h − √(1 + λ_m²h²), will be discussed in Section 6.5.3.
²Based on Section 4.4.
³The error is O(λ²h²).
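A quick numerical look at the two leapfrog roots (our own sketch, for an imaginary λh typical of biconvection):

```python
import numpy as np

lh = 0.2j                                # lambda*h, pure imaginary
s1 = lh + np.sqrt(1 + lh**2)             # principal root, Eq. 6.31
s2 = lh - np.sqrt(1 + lh**2)             # the other (spurious) root
print(abs(s1 - np.exp(lh)))              # small: O((lambda*h)^3)
print(abs(s2 - np.exp(lh)))              # O(1): no relation to e^{lambda*h}
```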
For a method of order k, the principal root satisfies
σ_m = 1 + λ_m h + (1/2)λ_m²h² + · · · + (1/k!)λ_m^k h^k + O(h^{k+1})   (6.33)
We refer to the root that has the above property as the principal σ-root, and designate it (σ_m)₁. The above property can be stated regardless of the details of the time-marching method, knowing only that its leading error is O(h^{k+1}). Thus the principal root is an approximation to e^{λh} up to O(h^k).
Note that a second-order approximation to a derivative written in the form
(δ_t u)_n = (u_{n+1} − u_{n−1})/(2h)   (6.34)
has a leading truncation error which is O(h2 ), while the second-order time-marching
method which results from this approximation, the leapfrog method
u_{n+1} = u_{n−1} + 2hu′_n   (6.35)
has a leading truncation error O(h³). This arises simply because of our notation
for the time-marching method in which we have multiplied through by h to get an
approximation for the function un+1 rather than the derivative as in Eq. 6.34. The
following example makes this clear. Consider a solution obtained at a given time T
using a second-order time-marching method with a time step h. Now consider the
solution obtained using the same method with a time step h/2. Since the error per
time step is O(h3 ), this is reduced by a factor of eight (considering the leading term
only). However, twice as many time steps are required to reach the time T . Therefore
the error at the end of the simulation is reduced by a factor of four, consistent with
a second-order approximation.
Only one of the two leapfrog σ-roots has the property given in Eq. 6.33. The other is referred to as a spurious σ-root and
designated (σm )2 . In general, the λ − σ relation produced by a time-marching scheme
can result in multiple σ-roots all of which, except for the principal one, are spurious.
All spurious roots are designated (σm )k where k = 2, 3, · · ·. No matter whether a
σ-root is principal or spurious, it is always some algebraic function of the product λh.
To express this fact we use the notation σ = σ(λh).
If a time-marching method produces spurious σ-roots, the solution for the O∆E in
the form shown in Eq. 6.28 must be modified. Following again the message of Section
4.4, we have
u_n = c₁₁(σ₁)₁ⁿ x₁ + · · · + c_{m1}(σ_m)₁ⁿ x_m + · · · + c_{M1}(σ_M)₁ⁿ x_M + P.S.
    + c₁₂(σ₁)₂ⁿ x₁ + · · · + c_{m2}(σ_m)₂ⁿ x_m + · · · + c_{M2}(σ_M)₂ⁿ x_M
    + c₁₃(σ₁)₃ⁿ x₁ + · · · + c_{m3}(σ_m)₃ⁿ x_m + · · · + c_{M3}(σ_M)₃ⁿ x_M
    + etc., if there are more spurious roots   (6.36)
Spurious roots arise if a method uses data from time level n − 1 or earlier to
advance the solution from time level n to n + 1. Such roots originate entirely from
the numerical approximation of the time-marching method and have nothing to do
with the ODE being solved. However, generation of spurious roots does not, in itself,
make a method inferior. In fact, many very accurate methods in practical use for
integrating some forms of ODE’s have spurious roots.
It should be mentioned that methods with spurious roots are not self starting.
For example, if there is one spurious root to a method, all of the coefficients (cm )2
in Eq. 6.36 must be initialized by some starting procedure. The initial vector u0
does not provide enough data to initialize all of the coefficients. This results because
methods which produce spurious roots require data from time level n − 1 or earlier.
For example, the leapfrog method requires un−1 and thus cannot be started using
only un .
Presumably (i.e., if one starts the method properly) the spurious coefficients are
all initialized with very small magnitudes, and presumably the magnitudes of the
spurious roots themselves are all less than one (see Chapter 7). Then the presence of
spurious roots does not contaminate the answer. That is, after some finite time the
amplitude of the error associated with the spurious roots is even smaller than when
it was initialized. Thus while spurious roots must be considered in stability analysis,
they play virtually no role in accuracy analysis.
Time-marching methods that produce only one σ-root for each λ-root are referred to as one-root methods. They have the significant advantage of being self-starting, which carries with it the very useful property that the time-step interval can be changed at will throughout the marching process. Three one-root methods were analyzed in Section 6.4.2. A
popular method having this property, the so-called θ-method, is given by the formula
u_{n+1} = u_n + h[(1 − θ)u′_n + θu′_{n+1}]
The θ-method represents the explicit Euler (θ = 0), the trapezoidal (θ = 1/2), and the implicit Euler (θ = 1) methods, respectively. Its λ−σ relation is
σ = [1 + (1 − θ)λh] / (1 − θλh)
It is instructive to compare the exact solution to a set of ODE’s (with a complete
eigensystem) having time-invariant forcing terms with the exact solution to the O∆E’s
for one-root methods. These are
u(t) = c₁(e^{λ₁h})ⁿ x₁ + · · · + c_m(e^{λ_m h})ⁿ x_m + · · · + c_M(e^{λ_M h})ⁿ x_M + A⁻¹f
u_n  = c₁(σ₁)ⁿ x₁ + · · · + c_m(σ_m)ⁿ x_m + · · · + c_M(σ_M)ⁿ x_M + A⁻¹f   (6.37)
respectively. Notice that at t = 0 and n = 0 these equations are identical, so that all
the constants, vectors, and matrices are identical except the u and the terms inside
the parentheses on the right hand sides. The only error made by introducing the time
marching is the error that σ makes in approximating eλh .
• evaluating numerical stability and separating the errors in phase and amplitude.
The latter three of these are of concern to us here, and to study them we make use of
the material developed in the previous sections of this chapter. Our error measures
are based on the difference between the exact solution to the representative ODE,
given by
u(t) = ce^{λt} + ae^{µt}/(µ − λ)   (6.38)
and the solution to the representative O∆E’s, including only the contribution from
the principal root, which can be written as
u_n = c₁(σ₁)ⁿ + ae^{µhn} · Q(e^{µh})/P(e^{µh})   (6.39)
The particular choice of an error measure, either local or global, is to some extent
arbitrary. However, a necessary condition for the choice should be that the measure
can be used consistently for all methods. In the discussion of the λ-σ relation we
saw that all time-marching methods produce a principal σ-root for every λ-root that
exists in a set of linear ODE’s. Therefore, a very natural local error measure for the
transient solution is the value of the difference between solutions based on these two
roots. We designate this by erλ and make the following definition
erλ ≡ e^{λh} − σ₁
The leading error term can be found by expanding in a Taylor series and choosing
the first nonvanishing term. This is similar to the error found from a Taylor table.
The order of the method is the last power of λh matched exactly.
Amplitude and phase errors are important measures of the suitability of time-marching
methods for convection and wave propagation phenomena.
The approach to error analysis described in Section 3.5 can be extended to the
combination of a spatial discretization and a time-marching method applied to the
linear convection equation. The principal root, σ1 (λh), is found using λ = −iaκ∗ ,
where κ∗ is the modified wavenumber of the spatial discretization. Introducing the
Courant number, Cn = ah/∆x, we have λh = −iCn κ∗ ∆x. Thus one can obtain
values of the principal root over the range 0 ≤ κ∆x ≤ π for a given value of the
Courant number. The local phase error, erω ≡ ωh − tan⁻¹[(σ₁)ᵢ/(σ₁)ᵣ], can be normalized to give the error in the phase speed, as follows:
erp = erω/(ωh) = 1 + tan⁻¹[(σ₁)ᵢ/(σ₁)ᵣ]/(C_n κ∆x)   (6.42)
where ω = −aκ. A positive value of erp corresponds to phase lag (the numerical phase
speed is too small), while a negative value corresponds to phase lead (the numerical
phase speed is too large).
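A short sketch of this calculation (our own, combining Eq. 4.31 with the explicit Euler principal root σ₁ = 1 + λh at C_n = 0.1):

```python
import numpy as np

Cn = 0.1
kdx = np.linspace(0.01, np.pi, 100)     # kappa*dx over the resolvable range
lh = -1j * Cn * np.sin(kdx)             # lambda*h = -i*Cn*(kappa_star*dx)
sigma1 = 1 + lh                         # principal root of explicit Euler
erp = 1 + np.arctan2(sigma1.imag, sigma1.real) / (Cn * kdx)   # Eq. 6.42
print(erp[:5])                          # positive: phase lag
```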
The particular solutions of the representative ODE and O∆E are given by
P.S.(ODE) = ae^{µt}/(µ − λ)
and
P.S.(O∆E) = ae^{µt} · Q(e^{µh})/P(e^{µh})
respectively. For a measure of the local error in the particular solution we introduce
the definition
erµ ≡ h[ P.S.(O∆E)/P.S.(ODE) − 1 ]   (6.43)
The multiplication by h converts the error from a global measure to a local one, so
that the orders of erλ and erµ are consistent. In order to determine the leading error
term, Eq. 6.43 can be written in terms of the characteristic and particular polynomials
as
erµ = [c_o/(µ − λ)] · [(µ − λ)Q(e^{µh}) − P(e^{µh})]   (6.44)
where
c_o = lim_{h→0} h(µ − λ)/P(e^{µh})
The value of co is a method-dependent constant that is often equal to one. If the
forcing function is independent of time, µ is equal to zero, and for this case, many
numerical methods generate an erµ that is also zero.
The algebra involved in finding the order of erµ can be quite tedious. However,
this order is quite important in determining the true order of a time-marching method
by the process that has been outlined. An illustration of this is given in the section
on Runge-Kutta methods.
The reader should be aware that this is not sufficient. For example, to derive all of
the necessary conditions for the fourth-order Runge-Kutta method presented later
in this chapter the derivation must be performed for a nonlinear ODE. However, the
analysis based on a linear nonhomogeneous ODE produces the appropriate conditions
for the majority of time-marching methods used in CFD.
T = Nh
If the event is periodic, we are more concerned with the global error in amplitude and
phase. These are given by
Era = 1 − [√((σ₁)ᵣ² + (σ₁)ᵢ²)]^N   (6.49)
and
Erω ≡ N[ωh − tan⁻¹((σ₁)ᵢ/(σ₁)ᵣ)]
    = ωT − N tan⁻¹[(σ₁)ᵢ/(σ₁)ᵣ]   (6.50)
6.7 Linear Multistep Methods
The linear multistep methods are given by
Σ_{k=1−K}^{1} α_k u_{n+k} = h Σ_{k=1−K}^{1} β_k F_{n+k}   (6.51)
where the notation for F is defined in Section 6.1. The methods are said to be linear because the α's and β's are independent of u and n, and they are said to be K-step because K time-levels of data are required to march the solution one time-step, h. They are explicit if β₁ = 0 and implicit otherwise.
When Eq. 6.51 is applied to the representative equation, Eq. 6.8, and the result is
expressed in operational form, one finds
Σ_{k=1−K}^{1} α_k E^k u_n = h Σ_{k=1−K}^{1} β_k E^k (λu_n + ae^{µhn})   (6.52)
We recall from Section 6.5.2 that a time-marching method when applied to the repre-
sentative equation must provide a σ-root, labeled σ1 , that approximates eλh through
the order of the method. The condition referred to as consistency simply means that
σ → 1 as h → 0, and it is certainly a necessary condition for the accuracy of any
time marching method. We can also agree that, to be of any value in time accuracy,
a method should at least be first-order accurate, that is σ → (1 + λh) as h → 0. One
can show that these conditions are met by any method represented by Eq. 6.51 if
Σ_k α_k = 0   and   Σ_k β_k = Σ_k (K + k − 1)α_k
Since both sides of Eq. 6.51 can be multiplied by an arbitrary constant, these methods
are often “normalized” by requiring
Σ_k β_k = 1
6.7.2 Examples
There are many special explicit and implicit forms of linear multistep methods. Two
well-known families of them, referred to as Adams-Bashforth (explicit) and Adams-
Moulton (implicit), can be designed using the Taylor table approach of Section 3.4.
The Adams-Moulton family is obtained from Eq. 6.51 with
α1 = 1, α0 = −1, αk = 0, k = −1, −2, · · · (6.53)
The Adams-Bashforth family has the same α’s with the additional constraint that
β1 = 0. The three-step Adams-Moulton method can be written in the following form
u_{n+1} = u_n + h(β₁u′_{n+1} + β₀u′_n + β₋₁u′_{n−1} + β₋₂u′_{n−2})   (6.54)
A Taylor table for Eq. 6.54 can be generated as in Table 6.1 (the Taylor table for the Adams-Moulton three-step linear multistep method).
Explicit Methods
u_{n+1} = u_n + hu′_n   (Euler)
u_{n+1} = u_{n−1} + 2hu′_n   (Leapfrog)
u_{n+1} = u_n + (1/2)h[3u′_n − u′_{n−1}]   (AB2)
u_{n+1} = u_n + (h/12)[23u′_n − 16u′_{n−1} + 5u′_{n−2}]   (AB3)

Implicit Methods
u_{n+1} = u_n + hu′_{n+1}   (Implicit Euler)
u_{n+1} = u_n + (1/2)h[u′_n + u′_{n+1}]   (Trapezoidal, AM2)
u_{n+1} = (1/3)[4u_n − u_{n−1} + 2hu′_{n+1}]   (2nd-order Backward)
u_{n+1} = u_n + (h/12)[5u′_{n+1} + 8u′_n − u′_{n−1}]   (AM3)
⁴Recall from Section 6.5.2 that a kth-order time-marching method has a leading truncation error term which is O(h^{k+1}).
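As an illustration of how such a two-step method is used in practice (our own sketch, with an explicit Euler step supplying the missing starting data), AB2 applied to u′ = λu:

```python
import numpy as np

lam, h, N = -1.0, 0.1, 20
u_prev = 1.0
u = u_prev + h * lam * u_prev            # Euler start provides u_1
for n in range(1, N):                    # AB2: u_{n+1} = u_n + h/2*(3u'_n - u'_{n-1})
    u, u_prev = u + 0.5 * h * (3 * lam * u - lam * u_prev), u
print(u, np.exp(lam * h * N))            # close to the exact e^{lam*t}
```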
θ ξ ϕ Method Order
0 0 0 Euler 1
1 0 0 Implicit Euler 1
1/2 0 0 Trapezoidal or AM2 2
1 1/2 0 2nd Order Backward 2
3/4 0 −1/4 Adams type 2
1/3 −1/2 −1/3 Lees Type 2
1/2 −1/2 −1/2 Two–step trapezoidal 2
5/9 −1/6 −2/9 A–contractive 2
0 −1/2 0 Leapfrog 2
0 0 1/2 AB2 2
0 −5/6 −1/3 Most accurate explicit 3
1/3 −1/6 0 Third–order implicit 3
5/12 0 1/12 AM3 3
1/6 −1/2 −1/6 Milne 4
Table 6.2. Some linear one- and two-step methods, see Eq. 6.59.
One can show after a little algebra that both erµ and erλ are reduced to O(h³) (i.e., the methods are 2nd-order accurate) if
ϕ = ξ − θ + 1/2
The class of all 3rd-order methods is determined by imposing the additional constraint
ξ = 2θ − 5/6
Finally, a unique fourth-order method is found by setting θ = −ϕ = −ξ/3 = 1/6.
ũ_{n+α} = u_n + αhu′_n
u_{n+1} = u_n + h[βũ′_{n+α} + γu′_n]   (6.60)
Considering only local accuracy, one is led, by following the discussion in Section 6.6,
to the following observations. For the method to be second-order accurate both erλ
and erµ must be O(h3 ). For this to hold for erλ , it is obvious from Eq. 6.61 that
γ + β = 1 ;   αβ = 1/2
which provides two equations for three unknowns. The situation for erµ requires some
algebra, but it is not difficult to show using Eq. 6.44 that the same conditions also
make it O(h3 ). One concludes, therefore, that the predictor-corrector sequence
ũ_{n+α} = u_n + αhu′_n
u_{n+1} = u_n + (1/2)h[(1/α)ũ′_{n+α} + ((2α − 1)/α)u′_n]   (6.63)
is a second-order accurate method for any α.
⁵Such as alternating direction, fractional-step, and hybrid methods.
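A quick check (our own) that the principal root of Eq. 6.63 is indeed independent of α and second-order accurate:

```python
import numpy as np

lh = 0.1                                  # lambda*h for the test
for alpha in (0.5, 1.0, 2.0):
    u = 1.0
    ut = u + alpha * lh * u               # predictor applied to u' = lambda*u
    u1 = u + 0.5 * lh * (ut / alpha + (2 * alpha - 1) / alpha * u)   # corrector
    print(alpha, u1, u1 - np.exp(lh))     # sigma = 1 + lh + lh^2/2; error O(h^3)
```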
The principal σ-root of a Runge-Kutta method matches the Taylor series expansion of e^{λh} out through the order of the method and then truncates. Thus for a Runge-Kutta method of order k (up to 4th order), the principal (and only) σ-root is given by
σ = 1 + λh + (1/2)λ²h² + · · · + (1/k!)λ^k h^k   (6.69)
It is not particularly difficult to build this property into a method, but, as we pointed
out in Section 6.6.4, it is not sufficient to guarantee k’th order accuracy for the solution
of u = F (u, t) or for the representative equation. To ensure k’th order accuracy, the
method must further satisfy the constraint that
erµ = O(hk+1 ) (6.70)
and this is much more difficult.
The most widely publicized Runge-Kutta process is the one that leads to the
fourth-order method. We present it below in some detail. It is usually introduced in
the form
k1 = hF (un , tn )
k2 = hF (un + βk1 , tn + αh)
k3 = hF (un + β1 k1 + γ1 k2 , tn + α1 h)
k4 = hF (un + β2 k1 + γ2 k2 + δ2 k3 , tn + α2 h)
followed by
u(tn + h) − u(tn ) = µ1 k1 + µ2 k2 + µ3 k3 + µ4 k4 (6.71)
However, we prefer to present it using predictor-corrector notation. Thus, a scheme
entirely equivalent to 6.71 is
û_{n+α} = u_n + βhu′_n
ũ_{n+α₁} = u_n + β₁hu′_n + γ₁hû′_{n+α}
ū_{n+α₂} = u_n + β₂hu′_n + γ₂hû′_{n+α} + δ₂hũ′_{n+α₁}
u_{n+1} = u_n + µ₁hu′_n + µ₂hû′_{n+α} + µ₃hũ′_{n+α₁} + µ₄hū′_{n+α₂}   (6.72)
Appearing in Eqs. 6.71 and 6.72 are a total of 13 parameters which are to be
determined such that the method is fourth-order according to the requirements in
Eqs. 6.69 and 6.70. First of all, the choices for the time samplings, α, α1 , and α2 , are
not arbitrary. They must satisfy the relations
α = β
α1 = β1 + γ1
α2 = β2 + γ2 + δ2 (6.73)
The algebra involved in finding algebraic equations for the remaining 10 parameters
is not trivial, but the equations follow directly from finding P (E) and Q(E) and then
satisfying the conditions in Eqs. 6.69 and 6.70. Using Eq. 6.73 to eliminate the β’s
we find from Eq. 6.69 the four conditions
µ₁ + µ₂ + µ₃ + µ₄ = 1   (1)
µ₂α + µ₃α₁ + µ₄α₂ = 1/2   (2)
µ₃αγ₁ + µ₄(αγ₂ + α₁δ₂) = 1/6   (3)
µ₄αγ₁δ₂ = 1/24   (4)
(6.74)
These four relations guarantee that the five terms in σ exactly match the first 5 terms in the expansion of e^{λh}. To satisfy the condition that erµ = O(h⁵), we have to fulfill four more conditions
µ₂α² + µ₃α₁² + µ₄α₂² = 1/3   (3)
µ₂α³ + µ₃α₁³ + µ₄α₂³ = 1/4   (4)
µ₃α²γ₁ + µ₄(α²γ₂ + α₁²δ₂) = 1/12   (4)
µ₃αα₁γ₁ + µ₄α₂(αγ₂ + α₁δ₂) = 1/8   (4)
(6.75)
The number in parentheses at the end of each equation indicates the order that
is the basis for the equation. Thus if the first 3 equations in 6.74 and the first
equation in 6.75 are all satisfied, the resulting method would be third-order accurate.
As discussed in Section 6.6.4, the fourth condition in Eq. 6.75 cannot be derived
using the methodology presented here, which is based on a linear nonhomogeneous
representative ODE. A more general derivation based on a nonlinear ODE can be
found in several books.7
There are eight equations in 6.74 and 6.75 which must be satisfied by the 10 unknowns. Since the equations are underdetermined, two parameters can be set arbitrarily. Several choices for the parameters have been proposed, but the most popular
one is due to Runge. It results in the “standard” fourth-order Runge-Kutta method
expressed in predictor-corrector form as
û_{n+1/2} = u_n + (1/2)hu′_n
ũ_{n+1/2} = u_n + (1/2)hû′_{n+1/2}
ū_{n+1} = u_n + hũ′_{n+1/2}
u_{n+1} = u_n + (1/6)h[u′_n + 2(û′_{n+1/2} + ũ′_{n+1/2}) + ū′_{n+1}]   (6.76)
⁷The present approach based on a linear inhomogeneous equation provides all of the necessary conditions for Runge-Kutta methods of up to third order.
Notice that this represents the simple sequence of conventional linear multistep meth-
ods referred to, respectively, as
{ Euler Predictor, Euler Corrector, Leapfrog Predictor, Milne Corrector } ≡ RK4
One can easily show that both the Burstein and the MacCormack methods given by
Eqs. 6.66 and 6.67 are second-order Runge-Kutta methods, and third-order methods
can be derived from Eqs. 6.72 by setting µ4 = 0 and satisfying only Eqs. 6.74 and the
first equation in 6.75. It is clear that for orders one through four, RK methods of order
k require k evaluations of the derivative function to advance the solution one time
step. We shall discuss the consequences of this in Chapter 8. Higher-order Runge-
Kutta methods can be developed, but they require more derivative evaluations than
their order. For example, a fifth-order method requires six evaluations to advance
the solution one step. In any event, storage requirements reduce the usefulness of
Runge-Kutta methods of order higher than four for CFD applications.
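The fourth-order behavior is easy to confirm on the representative equation. The sketch below (our own parameter choices) halves the step size and watches the global error drop by roughly a factor of 16:

```python
import numpy as np

lam, a, mu = -1.0, 1.0, 0.5

def F(u, t):
    return lam * u + a * np.exp(mu * t)

def exact(t):   # Eq. 6.38 with c fixed by u(0) = 1
    return (1 - a / (mu - lam)) * np.exp(lam * t) + a * np.exp(mu * t) / (mu - lam)

for N in (20, 40, 80):
    h, u, t = 1.0 / N, 1.0, 0.0
    for _ in range(N):                    # standard RK4, equivalent to Eq. 6.76
        k1 = h * F(u, t)
        k2 = h * F(u + k1 / 2, t + h / 2)
        k3 = h * F(u + k2 / 2, t + h / 2)
        k4 = h * F(u + k3, t + h)
        u += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    print(N, abs(u - exact(1.0)))         # ~16x smaller per halving of h
```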
6.10 Implementation of Implicit Methods
To illustrate the implementation of implicit methods, consider the representative equation
u′ = λu + ae^{µt}   (6.77)
solved using the implicit Euler method. Following the steps outlined in Section 6.2, we obtained
u_{n+1} = (1 − λh)⁻¹[u_n + hae^{µh(n+1)}]   (6.78)
For a coupled system of linear ODE's, du/dt = Au − f(t), where u and f are vectors and we still assume that A is not a function of u or t, the equivalent to Eq. 6.78 is
(I − hA)u_{n+1} = u_n − hf(t_{n+1})
or
u_{n+1} = (I − hA)⁻¹[u_n − hf(t_{n+1})]   (6.81)
The inverse is not actually performed, but rather we solve Eq. 6.81 as a linear system
of equations. For our one-dimensional examples, the system of equations which must
be solved is tridiagonal (e.g., for biconvection, A = −aBp (−1, 0, 1)/2∆x), and hence
its solution is inexpensive, but in multidimensions the bandwidth can be very large. In
general, the cost per time step of an implicit method is larger than that of an explicit
method. The primary area of application of implicit methods is in the solution of
stiff ODE’s, as we shall see in Chapter 8.
F(u, t) = F(u_n, t_n) + (∂F/∂u)_n (u − u_n) + (∂F/∂t)_n (t − t_n)
        + (1/2)(∂²F/∂u²)_n (u − u_n)² + (∂²F/∂u∂t)_n (u − u_n)(t − t_n)
        + (1/2)(∂²F/∂t²)_n (t − t_n)² + · · ·   (6.87)
On the other hand, the expansion of u(t) in terms of the independent variable t is
u(t) = u_n + (∂u/∂t)_n (t − t_n) + (1/2)(∂²u/∂t²)_n (t − t_n)² + · · ·   (6.88)
If t is within h of t_n, both (t − t_n)^k and (u − u_n)^k are O(h^k), and Eq. 6.87 can be written
F(u, t) = F_n + (∂F/∂u)_n (u − u_n) + (∂F/∂t)_n (t − t_n) + O(h²)   (6.89)
Notice that this is an expansion of the derivative of the function. Thus, relative to the
order of expansion of the function, it represents a second-order-accurate, locally-linear
approximation to F (u, t) that is valid in the vicinity of the reference station tn and
the corresponding un = u(tn ). With this we obtain the locally (in the neighborhood
of tn ) time-linear representation of Eq. 6.83, namely
du/dt = (∂F/∂u)_n u + F_n − (∂F/∂u)_n u_n + (∂F/∂t)_n (t − t_n) + O(h²)   (6.90)
Applying the trapezoidal method to Eq. 6.83 gives
u_{n+1} = u_n + (1/2)h[F_{n+1} + F_n] + hO(h²)   (6.91)
where we write hO(h2 ) to emphasize that the method is second order accurate. Using
Eq. 6.89 to evaluate Fn+1 = F (un+1 , tn+1 ), one finds
u_{n+1} = u_n + (1/2)h[ F_n + (∂F/∂u)_n (u_{n+1} − u_n) + h(∂F/∂t)_n + O(h²) + F_n ] + hO(h²)   (6.92)
Note that the O(h2 ) term within the brackets (which is due to the local linearization)
is multiplied by h and therefore is the same order as the hO(h2 ) error from the
Trapezoidal Method. The use of local time linearization updated at the end of each
time step, and the trapezoidal time march, combine to make a second-order-accurate
numerical integration process. There are, of course, other second-order implicit time-
marching methods that can be used. The important point to be made here is that
local linearization updated at each time step has not reduced the order of accuracy of
a second-order time-marching process.
A very useful reordering of the terms in Eq. 6.92 results in the expression
[1 − (1/2)h(∂F/∂u)_n] ∆u_n = hF_n + (1/2)h²(∂F/∂t)_n   (6.93)
where ∆u_n ≡ u_{n+1} − u_n. This is now in the delta form, which will be formally introduced in Section 12.6. In
many fluid mechanic applications the nonlinear function F is not an explicit function
114 CHAPTER 6. TIME-MARCHING METHODS FOR ODE’S
of t. In such cases the partial derivative of F (u) with respect to t is zero and Eq. 6.93
simplifies to the second-order accurate expression
[1 − (1/2)h(∂F/∂u)_n] ∆u_n = hF_n   (6.94)
Notice that the RHS is extremely simple. It is the product of h and the RHS of
the basic equation evaluated at the previous time step. In this example, the basic
equation was the simple scalar equation 6.83, but for our applications, it is generally
the space-differenced form of the steady-state equation of some fluid flow problem.
A numerical time-marching procedure using Eq. 6.94 is usually implemented as follows:
1. Solve for the elements of hF_n, store them in an array, say R, and save u_n.
2. Solve for the elements of the matrix multiplying ∆u_n and store in some appropriate manner making use of sparseness or bandedness of the matrix if possible. Let this storage area be referred to as B.
3. Solve the coupled set of linear equations B∆u_n = R for ∆u_n. (Very seldom does one find B⁻¹ in carrying out this step.)
4. Find u_{n+1} by adding ∆u_n to u_n. The solution for u_{n+1} is generally stored such that it overwrites the value of u_n, and the process is repeated.
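The procedure translates almost line for line into code. The sketch below (our own, using the linear diffusion operator as the function F, so that ∂F/∂u = A exactly) carries out the delta-form update of Eq. 6.94:

```python
import numpy as np

M, nu = 8, 1.0
dx = 1.0 / (M + 1)
h = 0.001
A = nu / dx**2 * (np.diag(np.ones(M - 1), -1) - 2 * np.eye(M)
                  + np.diag(np.ones(M - 1), 1))
u = np.sin(np.pi * dx * np.arange(1, M + 1))

for n in range(10):
    R = h * (A @ u)                       # step 1: R = h*F_n
    B = np.eye(M) - 0.5 * h * A           # step 2: matrix multiplying delta-u_n
    du = np.linalg.solve(B, R)            # step 3: solve B*du = R (no inverse)
    u = u + du                            # step 4: u_{n+1} overwrites u_n
print(u)
```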
The implicit Euler method, u_{n+1} = u_n + hF_{n+1}, can be treated in the same way: if we introduce Eq. 6.90 into this method, rearrange terms, and remove the explicit dependence on time, we arrive at the form
[1 − h(∂F/∂u)_n] ∆u_n = hF_n   (6.96)
We see that the only difference between the implementation of the trapezoidal method
and the implicit Euler method is the factor of 1/2 in the brackets on the left side of
Eqs. 6.94 and 6.96. Omission of this factor degrades the method in time accuracy by
one order of h. We shall see later that this method is an excellent choice for steady
problems.
Newton’s Method
Consider the limit h → ∞ of Eq. 6.96 obtained by dividing both sides by h and
setting 1/h = 0. There results
−(∂F/∂u)_n ∆u_n = F_n   (6.97)
or
u_{n+1} = u_n − [(∂F/∂u)_n]⁻¹ F_n   (6.98)
This is the well-known Newton method for finding the roots of a nonlinear equation
F (u) = 0. The fact that it has quadratic convergence is verified by a glance at Eqs.
6.87 and 6.88 (remember the dependence on t has been eliminated for this case). By
quadratic convergence, we mean that the error after a given iteration is proportional
to the square of the error at the previous iteration, where the error is the difference
between the current solution and the converged solution. Quadratic convergence is
thus a very powerful property. Use of a finite value of h in Eq. 6.96 leads to linear
convergence, i.e., the error at a given iteration is some multiple of the error at the
previous iteration. The reader should ponder the meaning of letting h → ∞ for the
trapezoidal method, given by Eq. 6.94.
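A one-line scalar demonstration of the quadratic convergence (F(u) = u² − 2 is our own illustrative choice, not from the text):

```python
import numpy as np

u = 1.0
for k in range(5):
    u = u - (u**2 - 2) / (2 * u)          # Eq. 6.98 with dF/du = 2u
    print(k, abs(u - np.sqrt(2)))         # error roughly squares each iteration
```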
First of all we reduce Eq. 6.99 to a set of first-order nonlinear equations by the
transformations
u₁ = d²f/dt² ,   u₂ = df/dt ,   u₃ = f   (6.100)
This gives the coupled set of three nonlinear equations
u₁′ = F₁ = −u₁u₃ − β(1 − u₂²)
u₂′ = F₂ = u₁
u₃′ = F₃ = u₂   (6.101)
and these can be represented in vector notation as
du/dt = F(u)   (6.102)
Now we seek to make the same local expansion that derived Eq. 6.90, except that
this time we are faced with a nonlinear vector function, rather than a simple nonlinear
scalar function. The required extension requires the evaluation of a matrix, called
the Jacobian matrix.8 Let us refer to this matrix as A. It is derived from Eq. 6.102
by the following process
A = [ ∂F₁/∂u₁   ∂F₁/∂u₂   ∂F₁/∂u₃ ]
    [ ∂F₂/∂u₁   ∂F₂/∂u₂   ∂F₂/∂u₃ ]
    [ ∂F₃/∂u₁   ∂F₃/∂u₂   ∂F₃/∂u₃ ]   (6.104)
The expansion of F(u) about some reference state u_n can be expressed in a way similar to the scalar expansion given by Eq. 6.87. Omitting the explicit dependency on the independent variable t, and defining F_n as F(u_n), one has⁹
n n
⁸Recall that we derived the Jacobian matrices for the two-dimensional Euler equations in Section 2.2.
⁹The Taylor series expansion of a vector contains a vector for the first term, a matrix times a vector for the second term, and tensor products for the terms of higher order.
F(u) = F_n + A_n(u − u_n) + O(h²)   (6.105)
where t − t_n is O(h), and the argument for O(h²) is the same as in the derivation of Eq. 6.88.
Using this we can write the local linearization of Eq. 6.102 as
du/dt = A_n u + (F_n − A_n u_n) + O(h²)   (6.106)
(the term in parentheses plays the role of a local “constant”)
which is a locally-linear, second-order-accurate approximation to a set of coupled
nonlinear ordinary differential equations that is valid for t ≤ tn + h. Any first- or
second-order time-marching method, explicit or implicit, could be used to integrate
the equations without loss in accuracy with respect to order. The number of times,
and the manner in which, the terms in the Jacobian matrix are updated as the solution
proceeds depends, of course, on the nature of the problem.
Returning to our simple boundary-layer example, which is given by Eq. 6.101, we
find the Jacobian matrix to be
A = [ −u₃   2βu₂   −u₁ ]
    [  1     0      0  ]
    [  0     1      0  ]   (6.107)
The student should be able to derive results for this example that are equivalent to
those given for the scalar case in Eq. 6.93. Thus for the Falkner-Skan equations the
trapezoidal method results in
[ 1 + (h/2)(u₃)_n   −βh(u₂)_n   (h/2)(u₁)_n ] [ (∆u₁)_n ]       [ −(u₁u₃)_n − β(1 − u₂²)_n ]
[ −h/2              1           0           ] [ (∆u₂)_n ]  = h  [ (u₁)_n                   ]
[ 0                 −h/2        1           ] [ (∆u₃)_n ]       [ (u₂)_n                   ]
We find u_{n+1} from u_{n+1} = u_n + ∆u_n, and the solution is now advanced one step. Re-evaluate the elements using u_{n+1} and continue. Without any iterating within a step advance, the solution will be second-order-accurate in time.
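One such step is sketched below (our own initial values, chosen only for illustration); the left-hand matrix is exactly I − (h/2)A with A from Eq. 6.107:

```python
import numpy as np

beta, h = 0.5, 0.05
u = np.array([0.5, 0.1, 0.0])            # (u1, u2, u3) = (f'', f', f), illustrative

def F(u):                                 # Eq. 6.101
    return np.array([-u[0] * u[2] - beta * (1 - u[1]**2), u[0], u[1]])

def jac(u):                               # Eq. 6.107
    return np.array([[-u[2], 2 * beta * u[1], -u[0]],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])

lhs = np.eye(3) - 0.5 * h * jac(u)        # trapezoidal delta form
du = np.linalg.solve(lhs, h * F(u))
u = u + du                                # advance one step, then re-evaluate
print(u)
```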
6.11 Problems
1. Find an expression for the nth term in the Fibonacci series, which is given
by 1, 1, 2, 3, 5, 8, . . . Note that the series can be expressed as the solution to a
difference equation of the form un+1 = un + un−1 . What is u25 ? (Let the first
term given above be u0 .)
2. The trapezoidal method u_{n+1} = u_n + (1/2)h(u′_{n+1} + u′_n) is used to solve the representative ODE.
5. Find the difference equation which results from applying the Gazdag predictor-
corrector method (Eq. 6.65) to the representative equation. Find the λ-σ rela-
tion.
ũ_{n+1/3} = u_n + hu′_n/3
ū_{n+1/2} = u_n + hũ′_{n+1/3}/2
u_{n+1} = u_n + hū′_{n+1/2}
Find the difference equation which results from applying this method to the
representative equation. Find the λ-σ relation. Find the solution to the differ-
ence equation, including the homogeneous and particular solutions. Find erλ
and erµ . What order is the homogeneous solution? What order is the particular
solution? Find the particular solution if the forcing term is fixed.
u(x, 0) = e^{−0.5[(x−0.5)/σ]²}
with σ = 0.08. Use the explicit Euler, 2nd-order Adams-Bashforth (AB2), im-
plicit Euler, trapezoidal, and 4th-order Runge-Kutta methods. For the explicit
Euler and AB2 methods, use a Courant number, ah/∆x, of 0.1; for the other
methods, use a Courant number of unity. Plot the solutions obtained at t = 1
compared to the exact solution (which is identical to the initial condition).
9. Using the computer program written for problem 7, compute the solution at
t = 1 using 2nd-order centered differences in space coupled with the 4th-order
Runge-Kutta method for grids of 100, 200, and 400 nodes. On a log-log scale,
plot the error given by
√[ (1/M) Σ_{j=1}^{M} (u_j − (u_exact)_j)² ]
where M is the number of grid nodes and uexact is the exact solution. Find the
global order of accuracy from the plot.
10. Using the computer program written for problem 8, repeat problem 9 using
4th-order (noncompact) differences in space.
11. Write a computer program to solve the one-dimensional linear convection equa-
tion with inflow-outflow boundary conditions and a = 1 on the domain 0 ≤
x ≤ 1. Let u(0, t) = sin ωt with ω = 10π. Run until a periodic steady state
is reached which is independent of the initial condition and plot your solution
compared with the exact solution. Use 2nd-order centered differences in space
with a 1st-order backward difference at the outflow boundary (as in Eq. 3.69)
together with 4th-order Runge-Kutta time marching. Use grids with 100, 200,
and 400 nodes and plot the error vs. the number of grid nodes, as described in
problem 9. Find the global order of accuracy.
13. Using the approach described in Section 6.6.2, find the phase speed error, erp ,
and the amplitude error, era , for the combination of second-order centered dif-
ferences and 1st, 2nd, 3rd, and 4th-order Runge-Kutta time-marching at a
Courant number of unity. Also plot the phase speed error obtained using exact
integration in time, i.e., that obtained using the spatial discretization alone.
Note that the required σ-roots for the various Runge-Kutta methods can be
deduced from Eq. 6.69, without actually deriving the methods. Explain your
results.
Chapter 7
STABILITY OF LINEAR
SYSTEMS
A general definition of stability is neither simple nor universal and depends on the
particular phenomenon being considered. In the case of the nonlinear ODE’s of
interest in fluid dynamics, stability is often discussed in terms of fixed points and
attractors. In these terms a system is said to be stable in a certain domain if, from
within that domain, some norm of its solution is always attracted to the same fixed
point. These are important and interesting concepts, but we do not dwell on them in
this work. Our basic concern is with time-dependent ODE’s and O∆E ’s in which the
coefficient matrices are independent of both u and t; see Section 4.2. We will refer
to such matrices as stationary. Chapters 4 and 6 developed the representative forms
of ODE’s generated from the basic PDE’s by the semidiscrete approach, and then
the O∆E’s generated from the representative ODE’s by application of time-marching
methods. These equations are represented by
du/dt = Au − f(t)   (7.1)
and
u_{n+1} = Cu_n − g_n   (7.2)
respectively. For a one-step method, the latter form is obtained by applying a time-
marching method to the generic ODE form in a fairly straightforward manner. For
example, the explicit Euler method leads to C = I + hA, and g_n = hf(nh). Methods
involving two or more steps can always be written in the form of Eq. 7.2 by introducing
new dependent variables. Note also that only methods in which the time and space
discretizations are treated separately can be written in an intermediate semi-discrete
form such as Eq. 7.1. The fully-discrete form, Eq. 7.2 and the associated stability
definitions and analysis are applicable to all methods.
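As a small concreteness check (our own sketch, for the diffusion model matrix), the σ-eigenvalues of C = I + hA are exactly 1 + hλ_m:

```python
import numpy as np

M, h = 6, 0.001
dx = 1.0 / (M + 1)
A = (np.diag(np.ones(M - 1), -1) - 2 * np.eye(M)
     + np.diag(np.ones(M - 1), 1)) / dx**2
C = np.eye(M) + h * A                     # explicit Euler in the Eq. 7.2 form
lam = np.linalg.eigvalsh(A)
sig = np.linalg.eigvalsh(C)
print(np.allclose(np.sort(sig), np.sort(1 + h * lam)))   # True
```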
Our definitions of stability are based entirely on the behavior of the homogeneous
parts of Eqs. 7.1 and 7.2. The stability of Eq. 7.1 depends entirely on the eigensys-
tem1 of A. The stability of Eq. 7.2 can often also be related to the eigensystem of
its matrix. However, in this case the situation is not quite so simple since, in our
applications to partial differential equations (especially hyperbolic ones), a stability
definition can depend on both the time and space differencing. This is discussed in
Section 7.4. Analysis of these eigensystems has the important added advantage that
it gives an estimate of the rate at which a solution approaches a steady-state if a
system is stable. Consideration will be given to matrices that have both complete
and defective eigensystems, see Section 4.2.3, with a reminder that a complete system
can be arbitrarily close to a defective one, in which case practical applications can
make the properties of the latter appear to dominate.
If A and C are stationary, we can, in theory at least, estimate their fundamental
properties. For example, in Section 4.3.2, we found from our model ODE’s for dif-
fusion and periodic convection what could be expected for the eigenvalue spectrums
of practical physical problems containing these phenomena. These expectations are
referred to many times in the following analysis of stability properties. They are
important enough to be summarized by the following:
• For diffusion dominated flows the λ-eigenvalues tend to lie along the negative
real axis.
• For periodic convection-dominated flows the λ-eigenvalues tend to lie along the
imaginary axis.
In many interesting cases, the eigenvalues of the matrices in Eqs. 7.1 and 7.2
are sufficient to determine the stability. In previous chapters, we designated these
eigenvalues as λm and σm for Eqs. 7.1 and 7.2, respectively, and we will find it
convenient to examine the stability of various methods in both the complex λ and
complex σ planes.
¹This is not the case if the coefficient matrix depends on t, even if it is linear.
For a stationary matrix A, Eq. 7.1 is inherently stable if, when f is constant, u remains bounded as t → ∞.   (7.3)
Note that inherent stability depends only on the transient solution of the ODE’s.
This states that, for inherent stability, all of the λ eigenvalues must lie on, or to the
left of, the imaginary axis in the complex λ plane. This criterion is satisfied for the
model ODE’s representing both diffusion and biconvection. It should be emphasized
(as it is an important practical consideration in convection-dominated systems) that
the special case for which λ = ±i is included in the domain of stability. In this case
it is true that u does not decay as t → ∞, but neither does it grow, so the above
condition is met. Finally we note that for ODE’s with complete eigensystems the
eigenvectors play no role in the inherent stability criterion.
u₁(t) = u₁(0)e^{λt}
u₂(t) = [u₂(0) + u₁(0)t]e^{λt}
u₃(t) = [u₃(0) + u₂(0)t + (1/2)u₁(0)t²]e^{λt}   (7.5)
Inspecting this solution, we see that for such cases condition 7.4 must be modified to the form
ℜ(λ) < 0   (7.6)
since for pure imaginary λ, u₂ and u₃ would grow without bound (linearly or quadratically) if u₂(0) ≠ 0 or u₁(0) ≠ 0. Theoretically this condition is sufficient for stability in the sense of Statement 7.3, since t^k e^{−|ε|t} → 0 as t → ∞ for all non-zero ε. However,
in practical applications the criterion may be worthless since there may be a very
large growth of the polynomial before the exponential “takes over” and brings about
the decay. Furthermore, on a computer such a growth might destroy the solution
process before it could be terminated.
Note that the stability condition 7.6 excludes the imaginary axis which tends to be
occupied by the eigenvalues related to biconvection problems. However, condition 7.6
is of little or no practical importance if significant amounts of dissipation are present.
For a stationary matrix C, Eq. 7.2 is numerically stable if, when g is constant, u_n remains bounded as n → ∞.   (7.7)
We see that numerical stability depends only on the transient solution of the O∆E ’s.
This definition of stability is sometimes referred to as asymptotic or time stability.
As we stated at the beginning of this chapter, stability definitions are not unique. A
definition often used in CFD literature stems from the development of PDE solutions
that do not necessarily follow the semidiscrete route. In such cases it is appropriate
to consider simultaneously the effects of both the time and space approximations. A
time-space domain is fixed and stability is defined in terms of what happens to some
norm of the solution within this domain as the mesh intervals go to zero at some
constant ratio. We discuss this point of view in Section 7.4.
For O∆E's with complete eigensystems this requires

|σm| ≤ 1 , for all m   (7.8)

This condition states that, for numerical stability, all of the σ eigenvalues (both principal and spurious, if there are any) must lie on or inside the unit circle in the complex σ-plane.
This definition of stability for O∆E ’s is consistent with the stability definition for
ODE’s. Again the sensitive case occurs for the periodic-convection model which places
the “correct” location of the principal σ–root precisely on the unit circle where the
solution is only neutrally stable. Further, for a complete eigensystem, the eigenvectors
play no role in the numerical stability assessment.
For defective systems the condition must again be tightened to

|σm| < 1 , for all m   (7.9)

since defective systems do not guarantee boundedness for |σ| = 1; for example, in Eq. 7.5 if |σ| = 1 and either u₂(0) ≠ 0 or u₁(0) ≠ 0 we get linear or quadratic growth.
3. Time-march methods are developed which guarantee that |σ(λh)| ≤ 1 and this
is taken to be the condition for numerical stability.
This does guarantee that a stationary system, generated from a PDE on some fixed
space mesh, will have a numerical solution that is bounded as t = nh → ∞. This
does not guarantee that desirable solutions are generated in the time march process
as both the time and space mesh intervals approach zero.
Now let us define stability in the time-space sense. First construct a finite time-space domain lying within 0 ≤ x ≤ L and 0 ≤ t ≤ T. Cover this domain with a grid that is equispaced in both time and space and fix the mesh ratio by the equation

c_n = ∆t/∆x
Next reduce our O∆E approximation of the PDE to a two-level (i.e., two time-planes) formula in the form of Eq. 7.2. The homogeneous part of this formula is

u_{n+1} = C u_n   (7.10)
Eq. 7.10 is said to be stable if any bounded initial vector, u0 , produces a bounded
solution vector, un , as the mesh shrinks to zero for a fixed cn . This is the classical
definition of stability. It is often referred to as Lax or Lax-Richtmyer stability. Clearly
as the mesh intervals go to zero, the number of time steps, N , must go to infinity in
order to cover the entire fixed domain, so the criterion in 7.7 is a necessary condition
for this stability criterion.
The significance of this definition of stability arises through Lax’s Theorem, which
states that, if a numerical method is stable (in the sense of Lax) and consistent then
it is convergent. A method is consistent if it produces no error (in the Taylor series
sense) in the limit as the mesh spacing and the time step go to zero (with cn fixed, in
the hyperbolic case). This is further discussed in Section 7.8. A method is convergent
if it converges to the exact solution as the mesh spacing and time step go to zero in
this manner.3 Clearly, this is an important property.
Applying simple recursion to Eq. 7.10, we find

u_n = Cⁿ u₀

and using vector and matrix p-norms (see Appendix A) and their inequality relations, we have

||u_n|| = ||Cⁿ u₀|| ≤ ||Cⁿ|| · ||u₀|| ≤ ||C||ⁿ · ||u₀||   (7.11)

Since the initial data vector is bounded, the solution vector is bounded if

||C|| ≤ 1   (7.12)
where ||C|| represents any p-norm of C. This is often used as a sufficient condition
for stability.
Now we need to relate the stability definitions given in Eqs. 7.8 and 7.9 with that
given in Eq. 7.12. In Eqs. 7.8 and 7.9, stability is related to the spectral radius of
C, i.e., its eigenvalue of maximum magnitude. In Eq. 7.12, stability is related to a
p-norm of C. It is clear that the criteria are the same when the spectral radius is a
true p-norm.
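A small numerical illustration of this point (an added sketch; the matrices are chosen arbitrarily): for a normal matrix the spectral radius and the 2-norm coincide, while for a non-normal matrix the norm can exceed the spectral radius, allowing transient growth of ||Cⁿu₀|| even when every |σ| < 1.

    import numpy as np

    C_normal    = np.array([[0.5, 0.2], [0.2, 0.5]])   # symmetric, hence normal
    C_nonnormal = np.array([[0.5, 10.0], [0.0, 0.5]])  # rho = 0.5, large 2-norm
    for name, C in [("normal", C_normal), ("non-normal", C_nonnormal)]:
        rho   = np.abs(np.linalg.eigvals(C)).max()
        norm2 = np.linalg.norm(C, 2)
        print(f"{name:10s} spectral radius = {rho:.3f}, 2-norm = {norm2:.3f}, "
              f"commutes with transpose: {np.allclose(C @ C.T, C.T @ C)}")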
Two facts about the relation between spectral radii and matrix norms are well known:

1. The spectral radius of a matrix is its L₂ norm when the matrix is normal, i.e., it commutes with its transpose.

2. The spectral radius is the lower bound of all norms.
Furthermore, when C is normal, the second inequality in Eq. 7.11 becomes an equal-
ity. In this case, Eq. 7.12 becomes both necessary and sufficient for stability. From
these relations we draw two important conclusions about the numerical stability of
methods used to solve PDE’s.
• The stability criteria in Eqs. 7.8 and 7.12 are identical for stationary systems
when the governing matrix is normal. This includes symmetric, asymmetric,
and circulant matrices. These criteria are both necessary and sufficient for
methods that generate such matrices and depend solely upon the eigenvalues of
the matrices.
• If the spectral radius of any governing matrix is greater than one, the method
is unstable by any criterion. Thus for general matrices, the spectral radius
condition is necessary4 but not sufficient for stability.
⁴ Actually the necessary condition is that the spectral radius of C be less than or equal to 1 + O(∆t), but this distinction is not critical for our purposes here.
where the σm are the eigenvalues of C. If the semi-discrete approach is used, we can
find a relation between the σ and the λ eigenvalues. This serves as a very convenient
guide as to where we might expect the σ-roots to lie relative to the unit circle in the
complex σ-plane. For this reason we will proceed to trace the locus of the σ-roots
as a function of the parameter λh for the equations modeling diffusion and periodic convection.⁶
Figure 7.1 shows the exact trace of the σ-root if it is generated by eλh representing
either diffusion or biconvection. In both cases the • represents the starting value
where h = 0 and σ = 1. For the diffusion model, λh is real and negative. As the
magnitude of λh increases, the trace representing the dissipation model heads towards
the origin as λh → −∞. On the other hand, for the biconvection model, λh = iωh
is always imaginary. As the magnitude of ωh increases, the trace representing the
biconvection model travels around the circumference of the unit circle, which it never
leaves. We must be careful in interpreting σ when it is representing eiωh . The fact
that it lies on the unit circle means only that the amplitude of the representation is
correct, it tells us nothing of the phase error (see Eq. 6.41). The phase error relates
to the position on the unit circle.
⁵ The subject of defective eigensystems has been addressed. From now on we will omit further discussion of this special case.
⁶ Or, if you like, the parameter h for fixed values of λ equal to −1 and i for the diffusion and biconvection cases, respectively.
Now let us compare the exact σ-root traces with some that are produced by actual
time-marching methods. Table 7.1 shows the λ-σ relations for a variety of methods.
Figures 7.2 and 7.3 illustrate the results produced by various methods when they are
applied to the model ODE’s for diffusion and periodic-convection, Eqs. 4.4 and 4.5.
It is implied that the behavior shown is typical of what will happen if the methods
are applied to diffusion- (or dissipation-) dominated or periodic convection-dominated
problems as well as what does happen in the model cases. Most of the important
possibilities are covered by the illustrations.
Table 7.1: Some λ-σ relations.

 1.  σ − 1 − λh = 0                                                     Explicit Euler
 2.  σ² − 2λhσ − 1 = 0                                                  Leapfrog
 3.  σ² − (1 + (3/2)λh)σ + (1/2)λh = 0                                  AB2
 4.  σ³ − (1 + (23/12)λh)σ² + (16/12)λhσ − (5/12)λh = 0                 AB3
 5.  σ(1 − λh) − 1 = 0                                                  Implicit Euler
 6.  σ(1 − (1/2)λh) − (1 + (1/2)λh) = 0                                 Trapezoidal
 7.  σ²(1 − (2/3)λh) − (4/3)σ + (1/3) = 0                               2nd O Backward
 8.  σ²(1 − (5/12)λh) − (1 + (8/12)λh)σ + (1/12)λh = 0                  AM3
 9.  σ² − (1 + (13/12)λh + (15/24)λ²h²)σ + (1/12)λh(1 + (5/2)λh) = 0    ABM3
10.  σ³ − (1 + 2λh)σ² + (3/2)λhσ − (1/2)λh = 0                          Gazdag
11.  σ − 1 − λh − (1/2)λ²h² = 0                                         RK2
12.  σ − 1 − λh − (1/2)λ²h² − (1/6)λ³h³ − (1/24)λ⁴h⁴ = 0                RK4
13.  σ²(1 − (1/3)λh) − (4/3)λhσ − (1 + (1/3)λh) = 0                     Milne 4th
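The σ-roots belonging to any entry in Table 7.1 can be produced numerically by treating the λ-σ relation as a polynomial in σ. A minimal Python sketch (added here, not from the text) for the AB2 relation, entry 3:

    import numpy as np

    def ab2_sigma(lh):
        # sigma^2 - (1 + 3/2 lh) sigma + 1/2 lh = 0  (Table 7.1, entry 3)
        return np.roots([1.0, -(1.0 + 1.5*lh), 0.5*lh])

    # Diffusion model, real negative lambda*h: stable up to lambda*h = -1
    for lh in [-0.5, -1.0, -1.1]:
        print(lh, np.abs(ab2_sigma(lh)))

The root magnitudes show the stability boundary on the negative real axis at λh = −1 (see also problem 6 at the end of this chapter).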
a. Explicit Euler Method

Figure 7.2 shows results for the explicit Euler method. When used for dissipation-dominated cases it is stable for the range −2 ≤ λh ≤ 0. (Usually the magnitude of λ has to be estimated, and often it is found by trial and error.) When used for biconvection the σ-trace falls outside the unit circle for all finite h, and the method has no range of stability in this case.
[Figure 7.1: Exact traces of σ = e^{λh} in the complex σ-plane: a) Dissipation, λh real, from 0 to −∞; b) Convection, λh = iωh, ωh from 0 to ∞. The • marks the starting value h = 0, σ = 1.]
b. Leapfrog Method
This is a two-root method, since there are two σ’s produced by every λ. When
applied to dissipation dominated problems we see from Fig. 7.2 that the principal
root is stable for a range of λh, but the spurious root is not. In fact, the spurious
root starts on the unit circle and falls outside of it for all (λh) < 0. However, for
biconvection cases, when λ is pure imaginary, the method is not only stable, but it
also produces a σ that falls precisely on the unit circle in the range 0 ≤ ωh ≤ 1.
As was pointed out above, this does not mean that the method is without error.
Although the figure shows that there is a range of ωh in which the leapfrog method
produces no error in amplitude, it says nothing about the error in phase. More is said
about this in Chapter 8.
[Figure 7.2: Traces of σ-roots in the complex σ-plane for a) explicit Euler (λh = −2), b) leapfrog (ωh = 1), and c) AB2 (λh = −1), applied to the diffusion (left) and convection (right) model equations.]
c. AB2 Method

For the second-order Adams-Bashforth method applied to biconvection, the λ-root is pure imaginary. In that case, as ωh increases away from zero, the spurious root remains inside the unit circle for a range of ωh. However, the principal root falls outside the unit circle for all ωh > 0, and for the biconvection model equation the method is unstable for all h.
d. Trapezoidal Method
The trapezoidal method is a very popular one for reasons that are partially illustrated
in Fig. 7.3. Its σ-roots fall on or inside the unit circle for both the dissipating and
the periodic convecting case and, in fact, it is stable for all values of λh for which
λ itself is inherently stable. Just like the leapfrog method it has the capability of
producing only phase error for the periodic convecting case, but there is a major
difference between the two since the trapezoidal method produces no amplitude error
for any ωh, not just the limited range 0 ≤ ωh ≤ 1.
e. Gazdag Method
The Gazdag method was designed to produce low phase error. Since its characteristic
polynomial for σ is a cubic (Table 7.1, no. 10), it must have two spurious roots in
addition to the principal one. These are shown in Fig. 7.3. In both the dissipation
and biconvection cases, a spurious root limits the stability. For the dissipating case,
a spurious root leaves the unit circle when λh < −1/2, and for the biconvecting case, when ωh > 2/3. Note that both spurious roots are located at the origin when λ = 0.
Mild instability
All conventional time-marching methods produce a principal root that is very close
to eλh for small values of λh. Therefore, on the basis of the principal root, the
stability of a method that is required to resolve a transient solution over a relatively
short time span may be a moot issue. Such cases are typified by the AB2 and RK2
methods when they are applied to a biconvection problem. Figs. 7.2c and 7.3f show
that for both methods the principal root falls outside the unit circle and is unstable
for all ωh. However, if the transient solution of interest can be resolved in a limited
number of time steps that are small in the sense of the figure, the error caused by this
instability may be relatively unimportant. If the root had fallen inside the circle the
[Figure 7.3: Traces of σ-roots in the complex σ-plane for e) Gazdag, f) RK2 (λh = −2), and g) RK4 (λh = −2.8, ωh = 2√2), applied to the diffusion (left) and convection (right) model equations.]
method would have been declared stable but an error of the same magnitude would
have been committed, just in the opposite direction. For this reason the AB2 and the
RK2 methods have both been used in serious quantitative studies involving periodic
convection. This kind of instability is referred to as mild instability and is not a
serious problem under the circumstances discussed.
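The size of a mild instability is easy to quantify (an added sketch, not from the text). For RK2 with λh = iωh the principal root is σ₁ = 1 + iωh − (ωh)²/2, so |σ₁|² = 1 + (ωh)⁴/4, which is only slightly greater than one for small ωh:

    # |sigma_1| for RK2 on pure imaginary lambda*h: mildly unstable
    for wh in [0.1, 0.2, 0.4]:
        sigma = 1 + 1j*wh - 0.5*wh**2
        print(f"wh = {wh}: |sigma| = {abs(sigma):.8f}, "
              f"after 100 steps: {abs(sigma)**100:.6f}")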
Catastrophic instability
There is a much more serious stability problem for small h that can be brought about
by the existence of certain types of spurious roots. One of the best illustrations of this
kind of problem stems from a critical study of the most accurate, explicit, two-step,
linear multistep method (see Table 7.1):
u_{n+1} = −4u_n + 5u_{n−1} + 2h ( 2u'_n + u'_{n−1} )   (7.13)
One can show, using the methods given in Section 6.6, that this method is third-
order accurate both in terms of erλ and erµ , so from an accuracy point of view it is
attractive. However, let us inspect its stability even for very small values of λh. This
can easily be accomplished by studying its characteristic polynomial when λh → 0.
From Eq. 7.13 it follows that for λh = 0, P(E) = E² + 4E − 5. Factoring P(σ) = 0 we find P(σ) = (σ − 1)(σ + 5) = 0. There are two σ-roots: σ₁, the principal one, equal to 1, and σ₂, a spurious one, equal to −5!
In order to evaluate the consequences of this result, one must understand how
methods with spurious roots work in practice. We know that they are not self start-
ing, and the special procedures chosen to start them initialize the coefficients of the
spurious roots, the cmk for k > 1 in Eq. 6.36. If the starting process is well designed
these coefficients are forced to be very small, and if the method is stable, they get
smaller with increasing n. However, if the magnitude of one of the spurious σ is equal to 5, one can see disaster is imminent because 5¹⁰ ≈ 10⁷. Even a very small initial value of c_{mk} is quickly overwhelmed. Such methods are called catastrophically
unstable and are worthless for most, if not all, computations.
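The catastrophe is easy to reproduce (an added sketch; λ and h are chosen arbitrarily). Even when the method of Eq. 7.13 is started from essentially exact data, truncation and roundoff excite the σ = −5 spurious root, which is then amplified roughly five-fold per step:

    import numpy as np

    lam, h = -1.0, 0.01              # u' = lam*u, so u'_n = lam*u_n
    u_prev, u = 1.0, np.exp(lam*h)   # exact starting values
    for n in range(2, 16):
        u_new = -4*u + 5*u_prev + 2*h*(2*lam*u + lam*u_prev)   # Eq. 7.13
        u_prev, u = u, u_new
        print(f"n = {n:2d}: computed = {u:12.4e}, exact = {np.exp(lam*n*h):.6f}")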
unstable for some complex λ as h proceeds away from zero. On the basis of a Taylor
series expansion, however, these methods are generally the most accurate insofar as
they minimize the coefficient in the leading term for ert .
The latter type is referred to as an Adams method. Since for these methods all spurious roots start at the origin for h = 0, they have a guaranteed range of
stability for small enough h. However, on the basis of the magnitude of the coefficient
in the leading Taylor series error term, they suffer, relatively speaking, from accuracy.
For a given amount of computational work, the order of accuracy of the two
types is generally equivalent, and stability requirements in CFD applications generally
override the (usually small) increase in accuracy provided by a coefficient with lower
magnitude.
• One has inherently stable, coupled systems with λ–eigenvalues having widely
separated magnitudes.
or
• We seek only to find a steady-state solution using a path that includes the
unwanted transient.
In both of these cases there exist in the eigensystems relatively large values of
|λh| associated with eigenvectors that we wish to drive through the solution process
without any regard for their individual accuracy in eigenspace. This situation is
the major motivation for the study of numerical stability. It leads to the subject of
stiffness discussed in the next chapter.
 θ     ξ     ϕ      Method                 Order
 1     0     0      Implicit Euler           1
 1/2   0     0      Trapezoidal              2
 1     1/2   0      2nd O Backward           2
 3/4   0    −1/4    Adams type               2
 1/3  −1/2  −1/3    Lees                     2
 1/2  −1/2  −1/2    Two-step trapezoidal     2
 5/8  −1/6  −2/9    A-contractive            2
Notice that none of these methods has an accuracy higher than second-order. It can be proved that the order of an A-stable LMM cannot exceed two, and, furthermore, that of all 2nd-order A-stable methods, the trapezoidal method has the smallest truncation error.
Returning to the stability test using positive real functions, one can show that a two-step LMM is A₀-stable if and only if

θ ≥ ϕ + 1/2   (7.17)
ξ ≥ −1/2   (7.18)
0 ≤ θ − ϕ   (7.19)
For first-order accuracy, the inequalities 7.17 to 7.19 are less stringent than 7.14 to
7.16. For second-order accuracy, however, the parameters (θ, ξ, ϕ) are related by the
condition

ϕ = ξ − θ + 1/2

and the two sets of inequalities reduce to the same set, which is

ξ ≤ 2θ − 1   (7.20)
ξ ≥ −1/2   (7.21)
Hence, two-step, second-order accurate LMM's that are A-stable and A₀-stable share the same (ϕ, ξ, θ) parameter space. Although the order of accuracy of an A-stable method cannot exceed two, A₀-stable LMM's exist which have an accuracy of arbitrarily high order.
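The inequalities 7.17 to 7.19 are easy to check mechanically for the methods in the table above (an added sketch using exact rational arithmetic):

    from fractions import Fraction as F

    methods = {                     # (theta, xi, phi) from the table above
        "Implicit Euler":       (F(1),   F(0),    F(0)),
        "Trapezoidal":          (F(1,2), F(0),    F(0)),
        "2nd O Backward":       (F(1),   F(1,2),  F(0)),
        "Adams type":           (F(3,4), F(0),    F(-1,4)),
        "Lees":                 (F(1,3), F(-1,2), F(-1,3)),
        "Two-step trapezoidal": (F(1,2), F(-1,2), F(-1,2)),
        "A-contractive":        (F(5,8), F(-1,6), F(-2,9)),
    }
    for name, (th, xi, ph) in methods.items():
        ok = th >= ph + F(1,2) and xi >= F(-1,2) and th - ph >= 0  # 7.17-7.19
        print(f"{name:22s} A_o-stability conditions satisfied: {ok}")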
It has been shown that for a method to be I-stable it must also be A-stable.
Therefore, no further discussion is necessary for the special case of I-stability.
It is not difficult to prove that methods having a characteristic polynomial for which the coefficient of the highest-order term in E is unity can never be unconditionally stable. This includes all explicit methods and predictor-corrector methods made up of explicit sequences. Such methods are referred to, therefore, as conditionally stable methods.
[Figure 7.4: Typical stability contours of the one-root θ-method in the complex λh plane, showing stable and unstable regions for explicit and implicit variants.]
The |σ| = 1 contour goes through the point λh = 0. Here |σ| refers to the maximum absolute
value of any σ, principal or spurious, that is a root to the characteristic polynomial for
a given λh. It follows from Section 7.3 that on one side of this contour the numerical
method is stable while on the other, it is unstable. We refer to it, therefore, as a
stability contour.
Typical stability contours for both explicit and implicit methods are illustrated in
Fig. 7.4, which is derived from the one-root θ-method given in Section 6.5.4.
[Figure: Stable regions for the RK1, RK2, RK3, and RK4 methods in the complex λh plane.]
[Figure 7.7: Stability contours for two unconditionally stable implicit methods; the stable region lies outside each contour.]

The implicit Euler method is stable for the entire range of complex λh that falls outside the boundary.
This means that the method is numerically stable even when the ODE’s that it is
being used to integrate are inherently unstable. Some other implicit unconditionally
stable methods with the same property are shown in Fig. 7.7. In all of these cases
the imaginary axis is part of the stable region.
Not all unconditionally stable methods, however, are stable in regions where the ODE's
they are integrating are inherently unstable. The classic example of a method that
is stable only when the generating ODE’s are themselves inherently stable is the
trapezoidal method, i.e., the special case of the θ-method for which θ = 1/2. The
stability boundary for this case is shown in Fig. 7.4b. The boundary is the imaginary
axis and the numerical method is stable for λh lying on or to the left of this axis.
Two other methods that have this property are the two-step trapezoidal method

u_{n+1} = u_{n−1} + h ( u'_{n+1} + u'_{n−1} )

and a method due to Lees

u_{n+1} = u_{n−1} + (2/3) h ( u'_{n+1} + u'_n + u'_{n−1} )
Notice that both of these methods are of the Milne type.
[Figure 7.8: Stability contours for two conditionally stable implicit methods; one is stable only on the imaginary axis.]
In the Fourier approach one assumes that

u_j^{(n)} = e^{αt} · e^{iκx} = e^{αn∆t} · e^{iκj∆x} ,  t = n∆t ,  x = j∆x

is a solution to the difference equation, where κ is real and κ∆x lies in the range 0 ≤ κ∆x ≤ π.⁸

⁸ Another way of viewing this is to consider it as an initial value problem on an infinite space domain.

Since, for the general term,

u_{j+m}^{(n+1)} = e^{α(t+∆t)} · e^{iκ(x+m∆x)} = e^{α∆t} · e^{iκm∆x} · u_j^{(n)}
the quantity u_j^{(n)} is common to every term and can be factored out. In the remaining expressions, we find the term e^{α∆t}, which we represent by σ; thus:

σ ≡ e^{α∆t}

Then, since e^{αt} = (e^{α∆t})ⁿ = σⁿ, it is clear that for numerical stability we must have

|σ| ≤ 1   (7.23)

and the problem is to solve for the σ's produced by any given method and, as a necessary condition for stability, make sure that, in the worst possible combination of parameters, condition 7.23 is satisfied.⁹
⁹ More precisely, the necessary condition is |σ| ≤ 1 + O(∆t).
7.8 Consistency

Consider the model equation for diffusion analysis

∂u/∂t = ν ∂²u/∂x²   (7.27)
Many years before computers became available (1910, in fact), Lewis F. Richardson
proposed a method for integrating equations of this type. We presented his method
in Eq. 4.2 and analyzed its stability by the Fourier method in Section 7.7.
In Richardson’s time, the concept of numerical instability was not known. How-
ever, the concept is quite clear today and we now know immediately that his approach
would be unstable. As a semi-discrete method it can be expressed in matrix notation
as the system of ODE’s:
du/dt = (ν/∆x²) B(1, −2, 1) u + (bc)   (7.28)
with the leapfrog method used for the time march. Our analysis in this chapter revealed that this is numerically unstable, since the λ-roots of B(1, −2, 1) are all real and negative and the spurious σ-root in the leapfrog method is unstable for all such cases; see Fig. 7.5b.
The method was used by Richardson for weather prediction, and this fact can now
be a source of some levity. In all probability, however, the hand calculations (the
only approach available at the time) were not carried far enough to exhibit strange
phenomena. We could, of course, use the 2nd-order Runge-Kutta method to integrate
Eq. 7.28 since it is stable for real negative λ’s. It is, however, conditionally stable
and for this case we are rather severely limited in time step size by the requirement
∆t ≤ ∆x2 /(2ν).
There are many ways to manipulate the numerical stability of algorithms. One of
them is to introduce mixed time and space differencing, a possibility we have not yet
considered. For example, we introduced the DuFort-Frankel method in Chapter 4:
u_j^{(n+1)} = u_j^{(n−1)} + (2ν∆t/∆x²) [ u_{j−1}^{(n)} − 2 ( u_j^{(n+1)} + u_j^{(n−1)} )/2 + u_{j+1}^{(n)} ]   (7.29)
in which the central term in the space derivative in Eq. 4.2 has been replaced by its
average value at two different time levels. Now let
α ≡ 2ν∆t/∆x²

and rearrange terms

(1 + α) u_j^{(n+1)} = (1 − α) u_j^{(n−1)} + α ( u_{j−1}^{(n)} + u_{j+1}^{(n)} )
There is no obvious ODE between the basic PDE and this final O∆E. Hence, there
is no intermediate λ-root structure to inspect. Instead one proceeds immediately to
the σ-roots.
The simplest way to carry this out is by means of the Fourier stability analysis introduced in Section 7.7. This leads at once to

(1 + α)σ = (1 − α)σ⁻¹ + α ( e^{iκ∆x} + e^{−iκ∆x} )

or

(1 + α)σ² − 2ασ cos(κ∆x) − (1 − α) = 0

The solution of the quadratic is

σ = [ α cos κ∆x ± √(1 − α² sin² κ∆x) ] / (1 + α)

There are 2M σ-roots, all of which have |σ| ≤ 1 for any real α in the range 0 ≤ α ≤ ∞.
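This claim is simple to verify numerically (an added sketch): sweep κ∆x through [0, π] for several values of α and record the largest root magnitude.

    import numpy as np

    # DuFort-Frankel: (1+a) sigma^2 - 2 a cos(k dx) sigma - (1-a) = 0
    for a in [0.1, 1.0, 10.0, 100.0]:
        worst = max(np.abs(np.roots([1 + a, -2*a*c, -(1 - a)])).max()
                    for c in np.cos(np.linspace(0, np.pi, 181)))
        print(f"alpha = {a:6.1f}: max |sigma| = {worst:.6f}")   # never exceeds 1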
The above result seems too good to be true, since we have found an unconditionally
stable method using an explicit combination of Lagrangian interpolation polynomials.
[Table 7.3: Summary of accuracy and consistency conditions for the RK2 and DuFort-Frankel methods; ε denotes an arbitrary error bound.]
7.9 Problems
1. Consider the ODE

u' = du/dt = Au + f

with

A = [ −10  −0.1  −0.1 ]        [ −1 ]
    [   1    −1     1 ] , f =  [  0 ]
    [  10     1    −1 ]        [  0 ]
(a) Find the eigenvalues of A using a numerical package. What is the steady-
state solution? How does the ODE solution behave in time?
(b) Write a code to integrate from the initial condition u(0) = [1, 1, 1]T from
time t = 0 using the explicit Euler, implicit Euler, and MacCormack meth-
ods. In all three cases, use h = 0.1 for 1000 time steps, h = 0.2 for 500
time steps, h = 0.4 for 250 time steps and h = 1.0 for 100 time steps.
Compare the computed solution with the exact steady solution.
(c) Using the λ-σ relations for these three methods, what are the expected
bounds on h for stability? Are your results consistent with these bounds?
2. (a) Compute a table of the numerical values of the σ-roots of the 2nd-order
Adams-Bashforth method when λ = i. Take h in intervals of 0.05 from 0
to 0.80 and compute the absolute values of the roots to at least 6 places.
(b) Plot the trace of the roots in the complex σ-plane and draw the unit circle
on the same plot.
(c) Repeat the above for the RK2 method.
3. When applied to the linear convection equation, the widely known Lax–Wendroff
method gives:
u_j^{n+1} = u_j^n − (1/2) C_n ( u_{j+1}^n − u_{j−1}^n ) + (1/2) C_n² ( u_{j+1}^n − 2u_j^n + u_{j−1}^n )
where Cn , known as the Courant (or CFL) number, is ah/∆x. Using Fourier
stability analysis, find the range of Cn for which the method is stable.
4. Determine and plot the stability contours for the Adams-Bashforth methods of
order 1 through 4. Compare with the Runge-Kutta methods of order 1 through
4.
5. Consider the following Padé-type difference formula (see Section 3.4.4):

(δ_x u)_{j−1} + 4(δ_x u)_j + (δ_x u)_{j+1} = (3/∆x) ( u_{j+1} − u_{j−1} )
By replacing the spatial index j by the temporal index n, obtain a time-marching
method using this formula. What order is the method? Is it explicit or implicit?
Is it a two-step LMM? If so, to what values of ξ, θ, and φ (in Eq. 6.59) does it
correspond? Derive the λ-σ relation for the method. Is it A-stable, Ao -stable,
or I-stable?
6. Write the ODE system obtained by applying the 2nd-order centered difference
approximation to the spatial derivative in the model diffusion equation with
periodic boundary conditions. Using Appendix B.4, find the eigenvalues of the
spatial operator matrix. Given that the λ-σ relation for the 2nd-order Adams-
Bashforth method is
σ 2 − (1 + 3λh/2)σ + λh/2 = 0
show that the maximum stable value of |λh| for real negative λ, i.e., the point
where the stability contour intersects the negative real axis, is obtained with
λh = −1. Using the eigenvalues of the spatial operator matrix, find the maxi-
mum stable time step for the combination of 2nd-order centered differences and
the 2nd-order Adams-Bashforth method applied to the model diffusion equa-
tion. Repeat using Fourier analysis.
9. Consider the linear convection with a positive wave speed as in problem 8. Apply
a Dirichlet boundary condition at the left boundary. No boundary condition is
permitted at the right boundary. Write the system of ODE’s which results from
first-order backward spatial differencing in matrix-vector form. Using Appendix
B.1, find the λ-eigenvalues. Write the O∆E which results from the application
of explicit Euler time marching in matrix-vector form, i.e.,

u_{n+1} = C u_n + g

Write C in banded matrix notation and give the entries of g. Using the λ-
σ relation for the explicit Euler method, find the σ-eigenvalues. Based on
these, what is the maximum Courant number allowed for asymptotic stability?
Explain why this differs from the answer to problem 8. Hint: is C normal?
Chapter 8
CHOICE OF TIME-MARCHING
METHODS
[Figure 8.1: Stable and accurate regions for the explicit Euler method in the complex λh plane.]
the homogeneous part, we exclude the forcing function from further discussion in this
section.
Consider now the form of the exact solution of a system of ODE’s with a com-
plete eigensystem. This is given by Eq. 6.27 and its solution using a one-root, time-
marching method is represented by Eq. 6.28. For a given time step, the time integra-
tion is an approximation in eigenspace that is different for every eigenvector xm . In
many numerical applications the eigenvectors associated with the small |λm | are well
resolved and those associated with the large |λm | are resolved much less accurately,
if at all. The situation is represented in the complex λh plane in Fig. 8.1. In this
figure the time step has been chosen so that time accuracy is given to the eigenvectors
associated with the eigenvalues lying in the small circle and stability without time
accuracy is given to those associated with the eigenvalues lying outside of the small
circle but still inside the large circle.
The whole concept of stiffness in CFD arises from the fact that we often do
not need the time resolution of eigenvectors associated with the large |λm | in
the transient solution, although these eigenvectors must remain coupled into
the system to maintain a high accuracy of the spatial resolution.
C_r = |λ_M| / |λ_p|

and form the categories

Mildly-stiff            C_r < 10²
Strongly-stiff          10³ < C_r < 10⁵
Extremely-stiff         10⁶ < C_r < 10⁸
Pathologically-stiff    10⁹ < C_r
It should be mentioned that the gaps in the stiff category definitions are intentional
because the bounds are arbitrary. It is important to notice that these definitions
make no distinction between real, complex, and imaginary eigenvalues.
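For the model diffusion operator the stiffness ratio follows directly from the eigenvalues λ_m = [−2 + 2 cos(mπ/(M+1))]/∆x² and grows roughly as 4(M+1)²/π², so mesh refinement pushes a problem up through these categories. A short sketch (added here, not from the text):

    import numpy as np

    for M in [10, 40, 160]:
        m = np.arange(1, M + 1)
        lam = -2 + 2*np.cos(m*np.pi/(M + 1))  # the 1/dx^2 factor cancels in Cr
        print(f"M = {M:4d}: Cr = {np.abs(lam).max()/np.abs(lam).min():10.1f}")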
problems, but in general the stiffness of CFD problems is proportional to the mesh
intervals in the manner shown above where the critical interval is the smallest one in
the physical domain.
2. Implications of computer storage capacity and access time are ignored. In some
contexts, this can be an important consideration.
T = Nh
• The error in u at the end of the event, i.e., the global error, must be < 0.5%.
We judge the most efficient method as the one that satisfies these conditions and
has the fewest number of evaluations, Fev . Three methods are compared — explicit
Euler, AB2, and RK4.
First of all, the allowable error constraint means that the global error in the amplitude, see Eq. 6.48, must have the property

| Er_λ / e^{λT} | < 0.005
where σ1 is found from the characteristic polynomials given in Table 7.1. The results
shown in Table 8.1 were computed using a simple iterative procedure.
In this example we see that, for a given global accuracy, the method with the
highest local accuracy is the most efficient on the basis of the expense in evaluating
Fev . Thus the second-order Adams-Bashforth method is much better than the first-
order Euler method, and the fourth-order Runge-Kutta method is the best of all. The
main purpose of this exercise is to show the (usually) great superiority of second-order
over first-order time-marching methods.
Under these conditions a method is judged as best when it has the highest global
accuracy for resolving eigenvectors with imaginary eigenvalues. The above constraint
has led to the invention of schemes that omit the function evaluation in the cor-
rector step of a predictor-corrector combination, leading to the so-called incomplete
predictor-corrector methods. The presumption is, of course, that more efficient meth-
ods will result from the omission of the second function evaluation. An example is
the method of Gazdag, given in Section 6.8. Basically this is composed of an AB2
predictor and a trapezoidal corrector. However, the derivative of the fundamental
family is never found so there is only one evaluation required to complete each cycle.
The λ-σ relation for the method is shown as entry 10 in Table 7.1.
In order to discuss our comparisons we introduce the following definitions:
• Let a k-evaluation method be defined as one that requires k evaluations of
F (u, t) to advance one step using that method’s time interval, h.
• Let K represent the total number of allowable Fev .
• Let h1 be the time interval advanced in one step of a one-evaluation method.
The Gazdag, leapfrog, and AB2 schemes are all 1-evaluation methods. The second
and fourth order RK methods are 2- and 4-evaluation methods, respectively. For a 1-
evaluation method the total number of time steps, N , and the number of evaluations,
K, are the same, one evaluation being used for each step, so that for these methods
h = h1 . For a 2-evaluation method N = K/2 since two evaluations are used for
each step. However, in this case, in order to arrive at the same time T after K
evaluations, the time step must be twice that of a one-evaluation method so h = 2h₁.
For a 4-evaluation method the time interval must be h = 4h1 , etc. Notice that
as k increases, the time span required for one application of the method increases.
However, notice also that as k increases, the power to which σ1 is raised to arrive
at the final destination decreases; see the Figure below. This is the key to the true
comparison of time-march methods for this type of problem.
Step sizes and powers of σ for k-evaluation methods used to get to the same value of T when 8 evaluations are allowed:

k = 1:  8 steps of size h₁,  u_N given by [σ(λh₁)]⁸
k = 2:  4 steps of size 2h₁, u_N given by [σ(2λh₁)]⁴
k = 4:  2 steps of size 4h₁, u_N given by [σ(4λh₁)]²
In general, after K evaluations, the global amplitude and phase error for k-evaluation methods applied to systems with pure imaginary λ-roots can be written¹

Er_a = 1 − |σ₁(ikωh₁)|^{K/k}   (8.4)

Er_ω = ωT − (K/k) tan⁻¹ { [σ₁(ikωh₁)]_imaginary / [σ₁(ikωh₁)]_real }   (8.5)

¹ See Eqs. 6.38 and 6.39.
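A sketch of how Eqs. 8.4 and 8.5 are evaluated in practice (an added example; the values of ω, h₁, and K are arbitrary):

    import numpy as np

    sigma1 = {  # principal roots from Table 7.1, entries 11 and 12
        ("RK2", 2): lambda z: 1 + z + z**2/2,
        ("RK4", 4): lambda z: 1 + z + z**2/2 + z**3/6 + z**4/24,
    }
    omega, h1, K = 1.0, 0.1, 8
    for (name, k), sig in sigma1.items():
        s = sig(1j*k*omega*h1)
        Era = 1 - abs(s)**(K/k)                              # Eq. 8.4
        Erw = omega*K*h1 - (K/k)*np.arctan2(s.imag, s.real)  # Eq. 8.5, T = K h1
        print(f"{name}: Era = {Era:+.2e}, Erw = {Erw:+.2e}")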
[Table 8.2: Comparison of global amplitude and phase errors for four methods.]
² The σ₁ root for the Gazdag method can be found using a numerical root finding routine to trace the three roots in the σ-plane; see Fig. 7.3e.
Using analysis such as this (and also considering the stability boundaries) the
RK4 method is recommended as a basic first choice for any explicit time-accurate
calculation of a convection-dominated problem.
We will assume that our accuracy requirements are such that sufficient accuracy is
obtained as long as |λh| ≤ 0.1. This defines a time step limit based on accuracy
considerations of h = 0.001 for λ1 and h = 0.1 for λ2 . The time step limit based
on stability, which is determined from λ1 , is h = 0.02. We will also assume that
c1 = c2 = 1 and that an amplitude less than 0.001 is negligible. We first run 66
time steps with h = 0.001 in order to resolve the λ1 term. With this time step the
λ2 term is resolved exceedingly well. After 66 steps, the amplitude of the λ1 term
(i.e., (1 − 100h)n ) is less than 0.001 and that of the λ2 term (i.e., (1 − h)n ) is 0.9361.
Hence the λ1 term can now be considered negligible. To drive the (1 − h)n term to
zero (i.e., below 0.001), we would like to change the step size to h = 0.1 and continue.
We would then have a well resolved answer to the problem throughout the entire
relevant time interval. However, this is not possible because of the coupled presence of (1 − 100h)ⁿ, which in just 10 steps at h = 0.1 amplifies those terms by ≈ 10⁹, far outweighing the initial decrease obtained with the smaller time step. In fact, with
h = 0.02, the maximum step size that can be taken in order to maintain stability, about 339 time steps have to be computed in order to drive e^{−t} below 0.001. Thus the total simulation requires 405 time steps.
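The step counts quoted above can be verified with a few lines of arithmetic on the two amplitudes (1 − 100h)ⁿ and (1 − h)ⁿ (an added sketch):

    # Explicit Euler on lambda_1 = -100 and lambda_2 = -1 (sigma = 1 + lambda*h)
    amp1, amp2 = abs(1 - 100*0.001)**66, abs(1 - 0.001)**66
    print(amp1, amp2)                  # ~0.00094 (negligible) and ~0.9361
    n, amp = 0, amp2                   # continue at the stability limit h = 0.02
    while amp >= 0.001:
        amp *= abs(1 - 0.02)
        n += 1
    print(n, 66 + n)                   # ~339 further steps, 405 in total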
In order to resolve the initial transient of the term e^{−100t}, we need to use a step size of about h = 0.001. This is the same step size used in applying the explicit Euler
method because here accuracy is the only consideration and a very small step size
must be chosen to get the desired resolution. (It is true that for the same accuracy
we could in this case use a larger step size because this is a second-order method,
but that is not the point of this exercise). After 70 time steps the λ1 term has
amplitude less than 0.001 and can be neglected. Now with the implicit method we
can proceed to calculate the remaining part of the event using our desired step size
h = 0.1 without any problem of instability, with 69 steps required to reduce the
amplitude of the second term to below 0.001. In both intervals the desired solution
is second-order accurate and well resolved. It is true that in the final 69 steps one σ-root is [1 − 50(0.1)]/[1 + 50(0.1)] = −0.666···, and this has no physical meaning whatsoever. However, its influence on the coupled solution is negligible at the end of the first 70 steps, and, since |−0.666···|ⁿ < 1, its influence in the remaining 69
steps is even less. Actually, although this root is one of the principal roots in the
system, its behavior for t > 0.07 is identical to that of a stable spurious root. The
total simulation requires 139 time steps.
8.5.3 A Perspective
It is important to retain a proper perspective on a problem represented by the above
example. It is clear that an unconditionally stable method can always be called upon
to solve stiff problems with a minimum number of time steps. In the example, the
conditionally stable Euler method required 405 time steps, as compared to about
139 for the trapezoidal method, about three times as many. However, the Euler
method is extremely easy to program and requires very little arithmetic per step. For
preliminary investigations it is often the best method to use for mildly-stiff diffusion
dominated problems. For refined investigations of such problems an explicit method
of second order or higher, such as Adams-Bashforth or Runge-Kutta methods, is
recommended. These explicit methods can be considered as effective mildly stiff-
stable methods. However, it should be clear that as the degree of stiffness of the
problem increases, the advantage begins to tilt towards implicit methods, as the
reduced number of time steps begins to outweigh the increased cost per time step.
The reader can repeat the above example with λ₁ = −10,000, λ₂ = −1, which is in
the strongly-stiff category.
There is yet another technique for coping with certain stiff systems in fluid dynamic
applications. This is known as the multigrid method. It has enjoyed remarkable
success in many practical problems; however, we need an introduction to the theory
of relaxation before it can be presented.
u_n = c_{11}(σ₁)₁ⁿ x₁ + · · · + c_{m1}(σ_m)₁ⁿ x_m + · · · + c_{M1}(σ_M)₁ⁿ x_M + P.S.
    + c_{12}(σ₁)₂ⁿ x₁ + · · · + c_{m2}(σ_m)₂ⁿ x_m + · · · + c_{M2}(σ_M)₂ⁿ x_M
    + c_{13}(σ₁)₃ⁿ x₁ + · · · + c_{m3}(σ_m)₃ⁿ x_m + · · · + c_{M3}(σ_M)₃ⁿ x_M
    + etc., if there are more spurious roots   (8.8)
When solving a steady problem, we have no interest whatsoever in the transient por-
tion of the solution. Our sole goal is to eliminate it as quickly as possible. Therefore,
8.7. PROBLEMS 161
the choice of a time-marching method for a steady problem is similar to that for a
stiff problem, the difference being that the order of accuracy is irrelevant. Hence the
explicit Euler method is a candidate for steady diffusion dominated problems, and
the fourth-order Runge-Kutta method is a candidate for steady convection dominated
problems, because of their stability properties. Among implicit methods, the implicit
Euler method is the obvious choice for steady problems.
When we seek only the steady solution, all of the eigenvalues can be considered to
be parasitic. Referring to Fig. 8.1, none of the eigenvalues are required to fall in the
accurate region of the time-marching method. Therefore the time step can be chosen
to eliminate the transient as quickly as possible with no regard for time accuracy.
For example, when using the implicit Euler method with local time linearization, Eq.
6.96, one would like to take the limit h → ∞, which leads to Newton’s method, Eq.
6.98. However, a finite time step may be required until the solution is somewhat close
to the steady solution.
8.7 Problems
1. Repeat the time-march comparisons for diffusion (Section 8.4.2) and periodic
convection (Section 8.4.3) using 2nd- and 3rd-order Runge-Kutta methods.
2. Repeat the time-march comparisons for diffusion (Section 8.4.2) and periodic
convection (Section 8.4.3) using the 3rd- and 4th-order Adams-Bashforth meth-
ods. Considering the stability bounds for these methods (see problem 4 in
Chapter 7) as well as their memory requirements, compare and contrast them
with the 3rd- and 4th-order Runge-Kutta methods.
Chapter 9

RELAXATION METHODS

F(u) = 0   (9.2)

In the latter case, the unsteady equations are integrated until the solution converges to a steady solution. The same approach permits a time-marching method to be used to solve a linear system of algebraic equations in the form

Ax = b   (9.3)
The common feature of all time-marching methods is that they are at least first-
order accurate. In this chapter, we consider iterative methods which are not time
accurate at all. Such methods are known as relaxation methods. While they are
applicable to coupled systems of nonlinear algebraic equations in the form of Eq. 9.2,
our analysis will focus on their application to large sparse linear systems of equations
in the form
A_b u − f_b = 0   (9.5)
where Ab is nonsingular, and the use of the subscript b will become clear shortly. Such
systems of equations arise, for example, at each time step of an implicit time-marching
method or at each iteration of Newton’s method. Using an iterative method, we seek
to obtain rapidly a solution which is arbitrarily close to the exact solution of Eq. 9.5,
which is given by
u_∞ = A_b⁻¹ f_b   (9.6)

C A_b u − C f_b = 0   (9.7)
The matrix in Eq. 9.9 has eigenvalues whose imaginary parts are much larger than
their real parts. It can first be conditioned so that the modulus of each element is 1.
This is accomplished using a diagonal preconditioning matrix
D = 2∆x · diag( 1, 1, 1, 1, 1/2 )   (9.10)
which scales each row. We then further condition with multiplication by the negative
transpose. The result is
A₂ = −A₁ᵀA₁ =

[  0   1             ]   [  0   1             ]
[ −1   0   1         ]   [ −1   0   1         ]
[     −1   0   1     ] · [     −1   0   1     ]
[         −1   0   1 ]   [         −1   0   1 ]
[             −1  −1 ]   [             −1   1 ]

  [ −1   0   1         ]
  [  0  −2   0   1     ]
= [  1   0  −2   0   1 ]   (9.11)
  [      1   0  −2   1 ]
  [          1   1  −2 ]
If we define a permutation matrix P¹ and carry out the process Pᵀ[−A₁ᵀA₁]P (which just reorders the elements of A₁ and doesn't change the eigenvalues) we find

Pᵀ [ −A₁ᵀA₁ ] P =

[ −2   1             ]
[  1  −2   1         ]
[      1  −2   1     ]   (9.12)
[          1  −2   1 ]
[              1  −1 ]
which has all negative real eigenvalues, as given in Appendix B. Thus even when
the basic matrix Ab has nearly imaginary eigenvalues, the conditioned matrix −ATb Ab
is nevertheless symmetric negative definite (i.e., symmetric with negative real eigen-
values), and the classical relaxation methods can be applied. We do not necessarily
recommend the use of −ATb as a preconditioner; we simply wish to show that a broad
range of matrices can be preconditioned into a form suitable for our analysis.
Aφ − f = 0 (9.13)
where A is symmetric negative definite.2 The symbol for the dependent variable has
been changed to φ as a reminder that the physics being modeled is no longer time
¹ A permutation matrix (defined as a matrix with exactly one 1 in each row and column, which has the property that Pᵀ = P⁻¹) just rearranges the rows and columns of a matrix.
² We use a symmetric negative definite matrix to simplify certain aspects of our analysis. Relaxation methods are applicable to more general matrices. The classical methods will usually converge if A_b is diagonally dominant, as defined in Appendix A.
accurate when we later deal with ODE formulations. Note that the solution of Eq. 9.13, φ = A⁻¹f, is guaranteed to exist because A is nonsingular. In the notation of Eqs. 9.5 and 9.7,

A = C A_b ,  f = C f_b   (9.14)
The above was written to treat the general case. It is instructive in formulating the
concepts to consider the special case given by the diffusion equation in one dimension
with unit diffusion coefficient (ν = 1):

∂u/∂t = ∂²u/∂x² − g(x)   (9.15)
This has the steady-state solution
∂²u/∂x² = g(x)   (9.16)
which is the one-dimensional form of the Poisson equation. Introducing the three-
point central differencing scheme for the second derivative with Dirichlet boundary
conditions, we find
du/dt = (1/∆x²) B(1, −2, 1) u + (bc) − g   (9.17)

where (bc) contains the boundary conditions and g contains the values of the source term at the grid nodes. In this case

A_b = (1/∆x²) B(1, −2, 1) ,  f_b = g − (bc)   (9.18)
where f = ∆x² f_b. If we consider a Dirichlet boundary condition on the left side and either a Dirichlet or a Neumann condition on the right side, then A has the form

A = B( 1, b, 1 ) ,  b = [ −2, −2, · · ·, s ]ᵀ   (9.19)

s = −2 or −1   (9.20)
Note that s = −1 is easily obtained from the matrix resulting from the Neumann
boundary condition given in Eq. 3.24 using a diagonal conditioning matrix. A tremen-
dous amount of insight to the basic features of relaxation is gained by an appropriate
study of the one-dimensional case, and much of the remaining material is devoted to
this case. We attempt to do this in such a way, however, that it is directly applicable
to two- and three-dimensional problems.
where φ_∞ was defined in Eq. 9.22. The residual at the nth iteration is defined as

r_n ≡ A φ_n − f   (9.26)

Multiply Eq. 9.25 by A from the left, and use the definition in Eq. 9.26. There results the relation between the error and the residual

A e_n − r_n = 0   (9.27)
Consequently, G is referred to as the basic iteration matrix, and its eigenvalues, which
we designate as σm , determine the convergence rate of a method.
In all of the above, we have considered only what are usually referred to as sta-
tionary processes in which H is constant throughout the iterations. Nonstationary
processes in which H (and possibly C) is varied at each iteration are discussed in
Section 9.5.
is based on the idea that if the correction produced by the Gauss-Seidel method tends
to move the solution toward φ∞ , then perhaps it would be better to move further in
this direction. It is usually expressed in two steps as
φ̃_j = (1/2) [ φ_{j−1}^{(n+1)} + φ_{j+1}^{(n)} − f_j ]

φ_j^{(n+1)} = φ_j^{(n)} + ω [ φ̃_j − φ_j^{(n)} ]   (9.32)

where ω generally lies between 1 and 2, but it can also be written in the single line

φ_j^{(n+1)} = (ω/2) φ_{j−1}^{(n+1)} + (1 − ω) φ_j^{(n)} + (ω/2) φ_{j+1}^{(n)} − (ω/2) f_j   (9.33)
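A compact implementation of the point operator in Eq. 9.33 (an added sketch, not the text's own code; it assumes the model problem B(1, −2, 1)φ = f with homogeneous Dirichlet boundaries, and ω = 1 recovers Gauss-Seidel):

    import numpy as np

    def sor_sweep(phi, f, omega):
        """One forward sweep of Eq. 9.33 for B(1,-2,1) phi = f."""
        for j in range(len(phi)):
            left  = phi[j-1] if j > 0 else 0.0            # Dirichlet: 0
            right = phi[j+1] if j < len(phi)-1 else 0.0   # Dirichlet: 0
            gs = 0.5*(left + right - f[j])                # Gauss-Seidel value
            phi[j] += omega*(gs - phi[j])                 # over-relaxed update
        return phi

    M = 39
    dx = 1.0/(M + 1)
    x = np.linspace(dx, 1 - dx, M)
    f = 6*x*dx**2                       # e.g. d2u/dx2 = 6x (see Section 9.6)
    phi = np.zeros(M)
    omega = 2/(1 + np.sin(np.pi/(M + 1)))   # optimum SOR, Eq. 9.70
    for _ in range(100):
        sor_sweep(phi, f, omega)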
A=L+D+U (9.34)
Then the point-Jacobi method is obtained with H = −D, which certainly meets
the criterion that it is easy to solve. The Gauss-Seidel method is obtained with
H = −(L + D), which is also easy to solve, being lower triangular.
H dφ/dt = Aφ − f   (9.35)

This is equivalent to

dφ/dt = H⁻¹C [ A_b φ − f_b ] = H⁻¹ [ Aφ − f ]   (9.36)
In the special case where H⁻¹A depends on neither u nor t, H⁻¹f is also independent of t, and the eigenvectors of H⁻¹A are linearly independent, the solution can be written as

φ = c₁ e^{λ₁t} x₁ + · · · + c_M e^{λ_M t} x_M + φ_∞   (9.37)

where the λ_m and x_m are the eigenvalues and eigenvectors of H⁻¹A, and φ_∞ = A⁻¹f, which is the solution of Eq. 9.13. We see that the goal of a relaxation method is to
remove the transient solution from the general solution in the most efficient way pos-
sible. The λ eigenvalues are fixed by the basic matrix in Eq. 9.36, the preconditioning
matrix in 9.7, and the secondary conditioning matrix in 9.35. The σ eigenvalues are
fixed for a given λh by the choice of time-marching method. Throughout the remain-
ing discussion we will refer to the independent variable t as “time”, even though no
true time accuracy is involved.
In a stationary method, H and C in Eq. 9.36 are independent of t, that is, they
are not changed throughout the iteration process. The generalization of this in our
approach is to make h, the “time” step, a constant for the entire iteration.
Suppose the explicit Euler method is used for the time integration. For this method σ_m = 1 + λ_m h. Hence the numerical solution after n steps of a stationary relaxation method can be expressed as (see Eq. 6.28)

φ_n = c₁ (1 + λ₁h)ⁿ x₁ + · · · + c_M (1 + λ_M h)ⁿ x_M + φ_∞
The initial amplitudes of the eigenvectors are given by the magnitudes of the cm .
These are fixed by the initial guess. In general it is assumed that any or all of the
eigenvectors could have been given an equally “bad” excitation by the initial guess,
so that we must devise a way to remove them all from the general solution on an
equal basis. Assuming that H −1 A has been chosen (that is, an iteration process has
been decided upon), the only free choice remaining to accelerate the removal of the
error terms is the choice of h. As we shall see, the three classical methods have all
been conditioned by the choice of H to have an optimum h equal to 1 for a stationary
iteration process.
H dφ/dt = B(1, −2, 1) φ − f   (9.40)
As a start, let us use for the numerical integration the explicit Euler method

φ_{n+1} = φ_n + h φ'_n   (9.41)
It is clear that the best choice of H from the point of view of matrix algebra is
−B(1, −2, 1) since then multiplication from the left by −B−1 (1, −2, 1) gives the cor-
rect answer in one step. However, this is not in the spirit of our study, since multi-
plication by the inverse amounts to solving the problem by a direct method without
iteration. The constraint on H that is in keeping with the formulation of the three
methods described in Section 9.2.3 is that all the elements above the diagonal (or
below the diagonal if the sweeps are from right to left) are zero. If we impose this
constraint and further restrict ourselves to banded tridiagonals with a single constant
in each band, we are led to
B(−β, 2/ω, 0) ( φ_{n+1} − φ_n ) = B(1, −2, 1) φ_n − f   (9.43)
where β and ω are arbitrary. With this choice of notation the three methods presented
in Section 9.2.3 can be identified using the entries in Table 9.1.
 β    ω                        Method         Equation
 0    1                        Point-Jacobi   6.2.3
 1    1                        Gauss-Seidel   6.2.4
 1    2/[1 + sin(π/(M+1))]     Optimum SOR    6.2.5
The fact that the values in the tables lead to the methods indicated can be verified
by simple algebraic manipulation. However, our purpose is to examine the whole
procedure as a special subset of the theory of ordinary differential equations. In this
light, the three methods are all contained in the following set of ODE’s
dφ/dt = B⁻¹(−β, 2/ω, 0) [ B(1, −2, 1) φ − f ]   (9.44)
and appear from it in the special case when the explicit Euler method is used for its
numerical integration. The point operator that results from the use of the explicit
Euler scheme is
φ_j^{(n+1)} = (ωβ/2) φ_{j−1}^{(n+1)} + (ω/2)(h − β) φ_{j−1}^{(n)} − (ωh − 1) φ_j^{(n)} + (ωh/2) φ_{j+1}^{(n)} − (ωh/2) f_j   (9.45)
A_b u − f_b = 0   (9.46)
This equation is preconditioned in some manner which has the effect of multiplication
by a conditioning matrix C giving
Aφ − f = 0 (9.47)
H dφ/dt = Aφ − f   (9.48)
This solution has the analytical form
φn = en + φ∞ (9.49)
where en is the transient, or error, and φ∞ ≡ A−1f is the steady-state solution. The
three classical methods, Point-Jacobi, Gauss-Seidel, and SOR, are identified for the
one-dimensional case by Eq. 9.44 and Table 9.1.
Given our assumption that the component of the error associated with each eigenvector is equally likely to be excited, the asymptotic convergence rate is determined by the eigenvalue σ_m of G (≡ I + H⁻¹A) having maximum absolute value. Thus

Convergence rate ∼ |σ_m|_max ,  m = 1, 2, · · ·, M
In this section, we use the ODE analysis to find the convergence rates of the three
classical methods represented by Eqs. 9.30, 9.31, and 9.32. It is also instructive to
inspect the eigenvectors and eigenvalues in the H−1 A matrix for the three methods.
This amounts to solving the generalized eigenvalue problem
B(1, −2, 1) x_m = λ_m B(−β, 2/ω, 0) x_m   (9.52)
The generalized eigensystem for simple tridiagonals is given in Appendix B.2. The
three special cases considered below are obtained with a = 1, b = −2, c = 1, d = −β,
e = 2/ω, and f = 0. To illustrate the behavior, we take M = 5 for the matrix order.
This special case makes the general result quite clear.
For M = 40, we obtain |σm |max = 0.9971. Thus after 500 iterations the error content
associated with each eigenvector is reduced to no more than 0.23 times its initial level.
Again from Appendix B.1, the eigenvectors of H−1 A are given by
(x_m)_j = sin ( j m π / (M+1) ) ,  j = 1, 2, . . . , M   (9.55)
This is a very "well-behaved" eigensystem with linearly independent eigenvectors and distinct eigenvalues. The first 5 eigenvectors are simple sine waves. For M = 5, the eigenvectors can be written as

x₁ = [ 1/2, √3/2, 1, √3/2, 1/2 ]ᵀ
x₂ = [ √3/2, √3/2, 0, −√3/2, −√3/2 ]ᵀ
x₃ = [ 1, 0, −1, 0, 1 ]ᵀ
x₄ = [ √3/2, −√3/2, 0, √3/2, −√3/2 ]ᵀ
x₅ = [ 1/2, −√3/2, 1, −√3/2, 1/2 ]ᵀ   (9.56)

The corresponding eigenvalues are

λ₁ = −1 + √3/2 = −0.134···
λ₂ = −1 + 1/2  = −0.5
λ₃ = −1        = −1.0
λ₄ = −1 − 1/2  = −1.5
λ₅ = −1 − √3/2 = −1.866···   (9.57)
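These values are easily confirmed numerically (an added sketch): with β = 0 and ω = 1 the matrix H of Eq. 9.43 is B(0, 2, 0) = 2I, so the λ_m are just the eigenvalues of B(1, −2, 1)/2.

    import numpy as np

    M = 5
    A = np.diag(np.ones(M-1), -1) - 2*np.eye(M) + np.diag(np.ones(M-1), 1)
    print(np.sort(np.linalg.eigvals(A/2.0).real))               # computed
    print(np.sort(-1 + np.cos(np.arange(1, M+1)*np.pi/(M+1))))  # -1 + cos(m pi/6)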
[Figure: σ_m versus λ_m h for the point-Jacobi method, h = 1.0, M = 5.]
The numerical solution is then

φ_n = c₁ [1 − (1 − √3/2)h]ⁿ x₁ + c₂ [1 − (1 − 1/2)h]ⁿ x₂
    + c₃ [1 − h]ⁿ x₃
    + c₄ [1 − (1 + 1/2)h]ⁿ x₄
    + c₅ [1 − (1 + √3/2)h]ⁿ x₅ + φ_∞   (9.58)
which can be studied using the results in Appendix B.2. One can show that the H−1 A
matrix for the Gauss-Seidel method, AGS , is
[Figure: σ versus λh for the point-Jacobi method with M = 5 for h < 1.0 (|σ_m| move as h decreases) and h > 1.0 (|σ_m| grow as h increases); |σ| exceeds 1.0 at h ≈ 1.072, so the iteration is unstable for h > 1.072.]
[Figure: σ versus λh for the Gauss-Seidel method, h = 1.0, M = 5; two defective λ_m.]
The eigenvectors and principal vectors are all real. For M = 5 they can be written

x₁ = [ 1/2, 3/4, 3/4, 9/16, 9/32 ]ᵀ ,  x₂ = [ √3/2, 3/4, 0, −√3/16, −√3/32 ]ᵀ ,
x₃ = [ 1, 0, 0, 0, 0 ]ᵀ ,  x₄ = [ 0, 2, −1, 0, 0 ]ᵀ ,  x₅ = [ 0, 0, 4, −4, 1 ]ᵀ   (9.65)
The corresponding eigenvalues are

λ₁ = −1/4
λ₂ = −3/4
λ₃ = −1
λ₄, λ₅ : defective, linked to λ₃ in a Jordan block   (9.66)
The numerical solution written in full is thus

(1/x⁵) [ −2x⁴          x⁴              0               0              0        ]
       [ −2x³ + x⁴     x³ − 2x⁴        x⁴              0              0        ]
       [ −2x² + x³     x² − 2x³ + x⁴   x³ − 2x⁴        x⁴             0        ]   (9.68)
       [ −2x + x²      x − 2x² + x³    x² − 2x³ + x⁴   x³ − 2x⁴       x⁴       ]
       [ −2 + x        1 − 2x + x²     x − 2x² + x³    x² − 2x³ + x⁴  x³ − 2x⁴ ]
3. If M is odd, one of the remaining eigenvalues is real and the others are complex
occurring in conjugate pairs.
One can easily show that the optimum ω for the stationary case is

ω_opt = 2 / [ 1 + sin( π/(M+1) ) ]   (9.70)

and for ω = ω_opt

λ_m = ζ_m² − 1
(x_m)_j = ζ_m^{j−1} sin ( j m π / (M+1) )   (9.71)

where

ζ_m = (ω_opt/2) [ p_m + i √(p₁² − p_m²) ] ,  p_m = cos ( m π/(M+1) )
Using the explicit Euler method to integrate the ODE's, σ_m = 1 − h + h ζ_m², and if h = 1, the optimum value for the stationary case, the λ-σ relation reduces to that shown in Fig. 9.5. This illustrates the fact that for optimum stationary SOR all the |σ_m| are identical and equal to ω_opt − 1. Hence the convergence rate is

|σ_m|_max = ω_opt − 1   (9.72)

ω_opt = 2 / [ 1 + sin( π/(M+1) ) ]
For M = 40, |σm |max = 0.8578. Hence the worst error component is reduced to less
than 0.23 times its initial value in only 10 iterations, much faster than both Gauss-
Seidel and Point-Jacobi. In practical applications, the optimum value of ω may have
to be determined by trial and error, and the benefit may not be as great.
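For comparison (an added sketch), the convergence rates of point-Jacobi and optimum stationary SOR at M = 40:

    import numpy as np

    M = 40
    w_opt = 2/(1 + np.sin(np.pi/(M + 1)))
    rates = {"Point-Jacobi": np.cos(np.pi/(M + 1)),   # |sigma|_max = 0.9971
             "Optimum SOR":  w_opt - 1}               # |sigma|_max = 0.8578
    for name, s in rates.items():
        n = np.log(0.23)/np.log(s)   # iterations to damp the worst mode to 0.23
        print(f"{name:13s} |sigma|_max = {s:.4f}, ~{n:.0f} iterations")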
For odd M, there are two real eigenvectors and one real principal vector. The remaining linearly independent eigenvectors are all complex. For M = 5 they can be written

x₁ = [ 1/2, 1/2, 1/3, 1/6, 1/18 ]ᵀ ,  x₂ = [ −6, 9, 16, 13, 6 ]ᵀ ,
x₃,₄ = [ √3(1 ± i√2)/2, √3(1 ± i√2)/6, 0, √3(5 ± i√2)/54, √3(7 ± 4i√2)/162 ]ᵀ ,
x₅ = [ 1, 0, 1/3, 0, 1/9 ]ᵀ   (9.73)
The corresponding eigenvalues are

λ₁ = −2/3
λ₂ : defective, linked to λ₁
λ₃ = −(10 − 2√2 i)/9
λ₄ = −(10 + 2√2 i)/9
λ₅ = −4/3   (9.74)
[Figure 9.5: σ versus λh for optimum stationary SOR, h = 1.0, M = 5: two real defective λ_m, two complex λ_m, one real λ_m.]
φ_N = c₁ x₁ ∏_{n=1}^{N} (1 + λ₁ h_n) + · · · + c_m x_m ∏_{n=1}^{N} (1 + λ_m h_n)
    + · · · + c_M x_M ∏_{n=1}^{N} (1 + λ_M h_n) + φ_∞   (9.76)
where the symbol Π stands for product. Since hn can now be changed at each step,
the error term can theoretically be completely eliminated in M steps by taking hm =
−1/λm , for m = 1, 2, · · · , M . However, the eigenvalues λm are generally unknown and
costly to compute. It is therefore unnecessary and impractical to set hm = −1/λm
for m = 1, 2, . . . , M . We will see that a few well chosen h’s can reduce whole clusters
of eigenvectors associated with nearby λ’s in the λm spectrum. This leads to the
concept of selectively annihilating clusters of eigenvectors from the error terms as
part of a total iteration process. This is the basis for the multigrid methods discussed
in Chapter 10.
Let us consider the very important case when all of the λ_m are real and negative (remember that they arise from a conditioned matrix, so this constraint is not unrealistic for quite practical cases). Consider one of the error terms taken from

e_N ≡ φ_N − φ_∞ = Σ_{m=1}^{M} c_m x_m ∏_{n=1}^{N} (1 + λ_m h_n)   (9.77)

Remember that all λ are negative real numbers representing the magnitudes of λ_m in an eigenvalue spectrum.
The error in the relaxation process represented by Eq. 9.76 is expressed in terms of a set of eigenvectors, x_m, amplified by the coefficients c_m ∏_n (1 + λ_m h_n). With each eigenvector there is a corresponding eigenvalue. Eq. 9.84 gives us the best choice of a series of h_n that will minimize the amplitude of the error carried in the eigenvectors associated with the eigenvalues between λ_b and λ_a.
As an example for the use of Eq. 9.84, let us consider the following problem:
where
T₃(3) = { [3 + √8]³ + [3 − √8]³ } / 2 ≈ 99   (9.88)
A plot of Eq. 9.87 is given in Fig. 9.6 and we see that the amplitudes of all the
eigenvectors associated with the eigenvalues in the range −2 ≤ λ ≤ −1 have been
reduced to less than about 1% of their initial values. The values of h used in Fig. 9.6
are

h₁ = 4/(6 − √3)
h₂ = 4/(6 − 0)
h₃ = 4/(6 + √3)
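A few lines suffice to reproduce the behavior shown in Fig. 9.6 (an added sketch): form (P_e)₃(λ) = ∏(1 + h_n λ) with these three steps and scan the target interval.

    import numpy as np

    h = [4/(6 - np.sqrt(3)), 4/6, 4/(6 + np.sqrt(3))]
    lam = np.linspace(-2, -1, 1001)
    Pe = np.ones_like(lam)
    for hn in h:
        Pe *= 1 + hn*lam
    print(np.abs(Pe).max())   # ~0.0101 = 1/T3(3), about 1% of the initial level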
Return now to Eq. 9.76. This was derived from Eq. 9.37 on the condition that the
explicit Euler method, Eq. 9.41, was used to integrate the basic ODE’s. If instead
the implicit trapezoidal rule
    φ_{n+1} = φ_n + ½h(φ′_{n+1} + φ′_n)    (9.89)
is used, the nonstationary formula
    φ_N = Σ_{m=1}^{M} c_m x_m ∏_{n=1}^{N} [ (1 + ½h_nλ_m) / (1 − ½h_nλ_m) ] + φ_∞    (9.90)
would result. This calls for a study of the rational “trapezoidal” polynomial, Pt :
    (P_t)_N(λ) = ∏_{n=1}^{N} (1 + ½h_nλ) / (1 − ½h_nλ)    (9.91)
under the same constraints as before, namely that the polynomial equal one at λ = 0
and that its maximum amplitude over the eigenvalue interval be as small as possible.

[Figure 9.6: The optimum 3-step Richardson polynomial (P_e)₃(λ) plotted for −2 ≤ λ ≤ 0 (top) and in close-up (bottom), showing the error amplitude reduced to about 1% or less on −2 ≤ λ ≤ −1.]
The optimum values of h can also be found for this problem, but we settle here for
the approximation suggested by Wachspress
    2/h_n = −λ_b (λ_a/λ_b)^{(n−1)/(N−1)} ,   n = 1, 2, ···, N    (9.93)
This process is also applied to problem 9.85. The results for (Pt )3 (λ) are shown in
Fig. 9.7. The error amplitude is about 1/5 of that found for (Pe )3 (λ) in the same
interval of λ. The values of h used in Fig. 9.7 are
    h₁ = 1 ,  h₂ = √2 ,  h₃ = 2
9.6 Problems
1. Given a relaxation method in the form

       H Δφ_n = Aφ_n − f

   show that

       φ_n = Gⁿφ₀ + (I − Gⁿ)A⁻¹f

   where G = I + H⁻¹A.
2. For a linear system of the form (A1 + A2 )x = b, consider the iterative method
where µ is a parameter. Show that this iterative method can be written in the
form
[Figure 9.7: (P_t)₃(λ) plotted for −2 ≤ λ ≤ 0 (top) and in close-up (bottom), computed with the Wachspress values of h_n.]
    ∂²u/∂x² − 6x = 0
For the initial condition, use u(x) = 0. Use second-order centered differences
on a grid with 40 cells (M = 39). Iterate to steady state using
Plot the solution after the residual is reduced by 2, 3, and 4 orders of mag-
nitude. Plot the logarithm of the L2 -norm of the residual vs. the number of
iterations. Determine the asymptotic convergence rate. Compare with the the-
oretical asymptotic convergence rate.
Chapter 10
MULTIGRID
The idea of systematically using sets of coarser grids to accelerate the convergence of
iterative schemes that arise from the numerical solution to partial differential equa-
tions was made popular by the work of Brandt. There are many variations of the
process and many viewpoints of the underlying theory. The viewpoint presented here
is a natural extension of the concepts discussed in Chapter 9.
10.1 Motivation
10.1.1 Eigenvector and Eigenvalue Identification with Space
Frequencies
Consider the eigensystem of the model matrix B(1, −2, 1). The eigenvalues and
eigenvectors are given in Sections 4.3.2 and 4.3.3, respectively. Notice that as the
magnitudes of the eigenvalues increase, the space-frequency (or wavenumber) of the
corresponding eigenvectors also increases. That is, if the eigenvalues are ordered such
that

    |λ₁| ≤ |λ₂| ≤ ··· ≤ |λ_M|    (10.1)

then the corresponding eigenvectors are ordered from low to high space frequencies.
This has a rational explanation from the origin of the banded matrix. Note that
    ∂²/∂x² sin(mx) = −m² sin(mx)    (10.2)

and recall that

    δ_xx φ = (1/∆x²) B(1, −2, 1) φ = (1/∆x²) X D(λ) X⁻¹ φ    (10.3)
where D(λ) is a diagonal matrix containing the eigenvalues. We have seen that
X⁻¹φ represents a sine transform, and Xφ, a sine synthesis. Therefore, the operation
(1/∆x²) D(λ) represents the numerical approximation of the multiplication of the
appropriate sine wave by the negative square of its wavenumber, −m². One finds
that

    (1/∆x²) λ_m = [(M+1)/π]² [ −2 + 2 cos( mπ/(M+1) ) ] ≈ −m² ,   m ≪ M    (10.4)
Hence, the correlation of large magnitudes of λm with high space-frequencies is to be
expected for these particular matrix operators. This is consistent with the physics of
diffusion as well. However, this correlation is not necessary in general. In fact, the
complete counterexample of the above association is contained in the eigensystem
for B(½, 1, ½). For this matrix one finds, from Appendix B, exactly the opposite
behavior.
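The correlation expressed by Eq. 10.4 is easy to confirm numerically; the following sketch (an illustration assuming Python/numpy and the domain [0, π], so that ∆x = π/(M+1)) compares λ_m/∆x² with −m² for a few low wavenumbers:

```python
import numpy as np

M = 63
dx = np.pi / (M + 1)
m = np.arange(1, M + 1)
lam = -2.0 + 2.0*np.cos(m*np.pi/(M + 1))   # eigenvalues of B(1,-2,1)

for mm in (1, 2, 4, 8):
    print(mm, lam[mm - 1]/dx**2, -mm**2)   # agreement improves for m << M
```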
The iterative procedure is applied to a system whose behavior is governed by its
basic properties. This form can be arrived at "naturally" by simply replacing the
derivatives in the PDE with difference schemes, as in the example given by Eq. 3.27,
or it can be "contrived" by further conditioning, as in the examples given by Eq. 9.11.

10.2 The Basic Process
The basic assumptions required for our description of the multigrid process are:
1. The problem is linear.
2. The eigenvalues, λm , of the matrix are all real and negative.
3. The λm are fairly evenly distributed between their maximum and minimum
values.
4. The eigenvectors associated with the eigenvalues having largest magnitudes can
be correlated with high frequencies on the differencing mesh.
5. The iterative procedure used greatly reduces the amplitudes of the eigenvectors
   associated with eigenvalues in the range between ½|λ|_max and |λ|_max.
These conditions are sufficient to ensure the validity of the process described next.
Having preconditioned (if necessary) the basic finite differencing scheme by a pro-
cedure equivalent to the multiplication by a matrix C, we are led to the starting
formulation
C[Abφ∞ − f b ] = 0 (10.5)
where the matrix formed by the product CAb has the properties given above. In Eq.
10.5, the vector f b represents the boundary conditions and the forcing function, if
any, and φ∞ is a vector representing the desired exact solution. We start with some
initial guess for φ∞ and proceed through n iterations making use of some iterative
process that satisfies property 5 above. We do not attempt to develop an optimum
procedure here, but for clarity we suppose that the three-step Richardson method
illustrated in Fig. 9.6 is used. At the end of the three steps we find r, the residual,
where
r = C[Abφ − f b ] (10.6)
Recall that the φ used to compute r is composed of the exact solution φ∞ and the
error e in such a way that
Ae − r = 0 (10.7)
where
A ≡ CAb (10.8)
Thus our goal now is to solve for e. We can write the exact solution for e in terms of
the eigenvectors of A, and the σ eigenvalues of the Richardson process in the form:
    e = Σ_{m=1}^{M/2} c_m x_m ∏_{n=1}^{3} [σ(λ_m h_n)] + Σ_{m=M/2+1}^{M} c_m x_m ∏_{n=1}^{3} [σ(λ_m h_n)]    (10.10)

where the second sum, which carries the upper half of the eigenvalue spectrum, has
very low amplitude.
Combining our basic assumptions, we can be sure that the high frequency content of
e has been greatly reduced (about 1% or less of its original value in the initial guess).
In addition, assumption 4 ensures that the error has been smoothed.
Next we construct a permutation matrix which separates a vector into two parts,
one containing the odd entries, and the other the even entries of the original vector
(or any other appropriate sorting which is consistent with the interpolation approxi-
mation to be discussed below). For a 7-point example
    [e₂]   [0 1 0 0 0 0 0] [e₁]
    [e₄]   [0 0 0 1 0 0 0] [e₂]
    [e₆]   [0 0 0 0 0 1 0] [e₃]          [e_e]
    [e₁] = [1 0 0 0 0 0 0] [e₄] ;        [e_o] = P e    (10.11)
    [e₃]   [0 0 1 0 0 0 0] [e₅]
    [e₅]   [0 0 0 0 1 0 0] [e₆]
    [e₇]   [0 0 0 0 0 0 1] [e₇]
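The following sketch (Python/numpy assumed; not from the original text) builds the permutation of Eq. 10.11 for the 7-point example and confirms that its inverse is its transpose:

```python
import numpy as np

n = 7
order = np.r_[np.arange(1, n, 2), np.arange(0, n, 2)]  # even entries, then odd
P = np.eye(n)[order]                                   # rows of I reordered

e = np.arange(1.0, n + 1)                # e1, ..., e7
print(P @ e)                             # [2. 4. 6. 1. 3. 5. 7.]
print(np.allclose(P.T @ P, np.eye(n)))   # True: P^{-1} = P^T
```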
Multiply Eq. 10.7 from the left by P and, since a permutation matrix has an inverse
which is its transpose, we can write

    P A Pᵀ · P e = P r

which, in partitioned form, is

    [A₁ A₂] [e_e]   [r_e]
    [A₃ A₄] [e_o] = [r_o]

Notice that the upper half of this system,

    A₁ e_e + A₂ e_o = r_e

is an exact expression.
At this point we make our one crucial assumption. It is that there is some connec-
tion between ee and eo brought about by the smoothing property of the Richardson
relaxation procedure. Since the top half of the frequency spectrum has been removed,
it is reasonable to suppose that the odd points are the average of the even points.
For example
    e₁ ≈ ½(e_a + e₂)
    e₃ ≈ ½(e₂ + e₄)
    e₅ ≈ ½(e₄ + e₆)        or        e_o = A₂ e_e    (10.15)
    e₇ ≈ ½(e₆ + e_b)
It is important to notice that ea and eb represent errors on the boundaries where the
error is zero if the boundary conditions are given. It is also important to notice that we
are dealing with the relation between e and r so the original boundary conditions and
forcing function (which are contained in f in the basic formulation) no longer appear
in the problem. Hence, no aliasing of these functions can occur in subsequent steps.
Finally, notice that, in this formulation, the averaging of e is our only approximation,
no operations on r are required or justified.
If the boundary conditions are Dirichlet, ea and eb are zero, and one can write for
the example case
    A₂ = ½ [1 0 0]
           [1 1 0]
           [0 1 1]    (10.16)
           [0 0 1]

Substituting this approximation into the exact expression gives

    A_c e_e = r_e

where

    A_c ≡ A₁ + A₂A₂
The form of Ac , the matrix on the coarse mesh, is completely determined by the
choice of the permutation matrix and the interpolation approximation. If the original
A had been B(7 : 1, −2, 1), our 7-point example would produce
    [ −2  ·  · | 1  1  ·  · ]
    [  · −2  · | ·  1  1  · ]
    [  ·  · −2 | ·  ·  1  1 ]      [ A₁  A₂ ]
    [  1  ·  · |−2  ·  ·  · ]  =   [ A₃  A₄ ]    (10.20)
    [  1  1  · | · −2  ·  · ]
    [  ·  1  1 | ·  · −2  · ]
    [  ·  ·  1 | ·  ·  · −2 ]

and the coarse-grid operator follows from A₁ + A₂A₂:

    [ −2  ·  · ]   [ 1 1 · · ]     [ 1 0 0 ]   [ −1  1/2   ·  ]
    [  · −2  · ] + [ · 1 1 · ] · ½ [ 1 1 0 ] = [ 1/2 −1  1/2 ] = A_c    (10.21)
    [  ·  · −2 ]   [ · · 1 1 ]     [ 0 1 1 ]   [  ·  1/2  −1  ]
                                   [ 0 0 1 ]
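As a quick check of this arithmetic, the next sketch (Python/numpy assumed; not from the original) forms A_c for the 7-point Dirichlet example and confirms that it equals ½B(1, −2, 1):

```python
import numpy as np

A1 = -2.0*np.eye(3)                        # even-even block of P A P^{-1}
A2 = np.array([[1., 1., 0., 0.],           # even-odd block
               [0., 1., 1., 0.],
               [0., 0., 1., 1.]])
interp = 0.5*np.array([[1., 0., 0.],       # Eq. 10.16: e_o = A2 e_e
                       [1., 1., 0.],
                       [0., 1., 1.],
                       [0., 0., 1.]])

Ac = A1 + A2 @ interp
print(Ac)                                  # equals (1/2) B(1,-2,1)
```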
The eigenvectors for the case of a Dirichlet condition on the left and a Neumann
condition on the right are illustrated in Fig. 10.1. All of them go through zero on
the left (Dirichlet) side, and all of them reflect on the right (Neumann) side.
For Neumann conditions, the interpolation formula in Eq. 10.15 must be changed.
In the particular case illustrated in Fig. 10.1, eb is equal to eM . If Neumann conditions
are on the left, ea = e1 . When eb = eM , the example in Eq. 10.16 changes to
    A₂ = ½ [1 0 0]
           [1 1 0]
           [0 1 1]    (10.23)
           [0 0 2]
The permutation matrix remains the same and both A1 and A2 in the partitioned
matrix P AP −1 are unchanged (only A4 is modified by putting −1 in the lower right
element). Therefore, we can construct the coarse matrix from
    [ −2  ·  · ]   [ 1 1 · · ]     [ 1 0 0 ]   [ −1  1/2    ·  ]
    [  · −2  · ] + [ · 1 1 · ] · ½ [ 1 1 0 ] = [ 1/2 −1   1/2 ] = A_c    (10.24)
    [  ·  · −2 ]   [ · · 1 1 ]     [ 0 1 1 ]   [  ·  1/2  −1/2 ]
                                   [ 0 0 2 ]
which gives us what we might have “expected.”
We will continue with Dirichlet boundary conditions for the remainder of this
Section. At this stage, we have reduced the problem from B(1, −2, 1)e = r on the
fine mesh to ½B(1, −2, 1)e_e = r_e on the next coarser mesh. Recall that our goal is
to solve for e, which will provide us with the solution φ∞ using Eq. 10.9. Given ee
computed on the coarse grid (possibly using even coarser grids), we can compute eo
using Eq. 10.15, and thus e. In order to complete the process, we must now determine
the relationship between ee and e.
In order to examine this relationship, we need to consider the eigensystems of A
and Ac :
A = XΛX −1, Ac = Xc Λc Xc−1 (10.25)
For A = B(M: 1, −2, 1) the eigenvalues and eigenvectors are

    λ_m = −2 [ 1 − cos( mπ/(M+1) ) ] ,   (x_m)_j = sin( jmπ/(M+1) ) ,   j = 1, 2, ···, M ;  m = 1, 2, ···, M    (10.26)
Based on our assumptions, the most difficult error mode to eliminate is that with
m = 1, corresponding to
    λ₁ = −2 [ 1 − cos( π/(M+1) ) ] ,   (x₁)_j = sin( jπ/(M+1) ) ,   j = 1, 2, ···, M    (10.27)
For example, with M = 51, λ1 = −0.003649. If we restrict our attention to odd M ,
then M_c = (M − 1)/2 is the size of A_c. The eigenvalue and eigenvector corresponding
to m = 1 for the matrix A_c = ½B(M_c: 1, −2, 1) are

    (λ_c)₁ = −[ 1 − cos( 2π/(M+1) ) ] ,   ((x_c)₁)_j = sin( 2jπ/(M+1) ) ,   j = 1, 2, ···, M_c    (10.28)
For M = 51 (Mc = 25), we obtain (λc )1 = −0.007291 = 1.998λ1 . As M increases,
(λc )1 approaches 2λ1 . In addition, one can easily see that (xc )1 coincides with x1 at
every second point of the latter vector, that is, it contains the even elements of x1 .
Now let us consider the case in which all of the error consists of the eigenvector
component x1 , i.e., e = x1 . Then the residual is
    r = A x₁ = λ₁ x₁    (10.29)

and the residual transferred to the coarse grid is

    r_e = λ₁ (x_c)₁    (10.30)

since (x_c)₁ contains the even elements of x₁. The exact solution on the coarse grid
satisfies
    e_e = A_c⁻¹ r_e = X_c Λ_c⁻¹ X_c⁻¹ λ₁ (x_c)₁    (10.31)

        = λ₁ X_c Λ_c⁻¹ [1, 0, ···, 0]ᵀ    (10.32)

        = λ₁ X_c [1/(λ_c)₁, 0, ···, 0]ᵀ    (10.33)

        = ( λ₁/(λ_c)₁ ) (x_c)₁    (10.34)

        ≈ ½ (x_c)₁    (10.35)
Since our goal is to compute e = x1 , in addition to interpolating ee to the fine grid
(using Eq. 10.15), we must multiply the result by 2. This is equivalent to solving
    ½ A_c e_e = r_e    (10.36)

or

    ¼ B(M_c: 1, −2, 1) e_e = r_e    (10.37)
In our case, the matrix A = B(M : 1, −2, 1) comes from a discretization of the
diffusion equation, which gives
    A_b = (ν/∆x²) B(M: 1, −2, 1)    (10.38)

and the preconditioning matrix C is simply

    C = (∆x²/ν) I    (10.39)
Applying the discretization on the coarse grid with the same preconditioning matrix
as used on the fine grid gives, since ∆xc = 2∆x,
    C (ν/∆x_c²) B(M_c: 1, −2, 1) = (∆x²/∆x_c²) B(M_c: 1, −2, 1) = ¼ B(M_c: 1, −2, 1)    (10.40)
which is precisely the matrix appearing in Eq. 10.37. Thus we see that the process is
recursive. The problem to be solved on the coarse grid is the same as that solved on
the fine grid.
The remaining steps required to complete an entire multigrid process are relatively
straightforward, but they vary depending on the problem and the user. The reduction
can be, and usually is, carried to even coarser grids before returning to the finest level.
However, in each case the appropriate permutation matrix and the interpolation
approximation define both the down- and up-going paths. The details of finding
optimum techniques are, obviously, quite important but they are not discussed here.
10.3 A Two-Grid Process

We now summarize the basic two-grid process.
1. Perform n₁ iterations of the selected relaxation method on the fine grid, starting
with φ = φ_n. Call the result φ^(1). This gives

    φ^(1) = G₁^{n₁} φ_n + (I − G₁^{n₁}) A₁⁻¹ f    (10.41)

where

    G₁ = I + H₁⁻¹ A₁    (10.42)

and H₁ is defined as in Chapter 9 (e.g., Eq. 9.21). Next compute the residual based
on φ^(1):

    r^(1) = A φ^(1) − f = A G₁^{n₁} φ_n + A (I − G₁^{n₁}) A₁⁻¹ f − f
          = A G₁^{n₁} φ_n − A G₁^{n₁} A₁⁻¹ f    (10.43)
Here A2 can be formed by applying the discretization on the coarse grid. In the
preceding example (eq. 10.40), A2 = 41 B(Mc : 1, −2, 1). It is at this stage that the
generalization to a multigrid procedure with more than two grids occurs. If this is the
coarsest grid in the sequence, solve exactly. Otherwise, apply the two-grid process
recursively.
4. Transfer (or prolong) the error back to the fine grid and update the solution:

    φ_{n+1} = φ^(1) − I₂¹ e^(2)    (10.47)

Combining these steps, one finds that the basic iteration matrix governing the error
over one complete two-grid cycle is

    [ I − I₂¹ A₂⁻¹ R₁² A ] G₁^{n₁}    (10.50)
The eigenvalues of this matrix determine the convergence rate of the two-grid process.
The basic iteration matrix for a three-grid process is found from Eq. 10.50 by
replacing A₂⁻¹ with (I − G₃)A₂⁻¹, where

    G₃ = [ I − I₃² A₃⁻¹ R₂³ A₂ ] G₂^{n₂}    (10.51)
In this expression n2 is the number of relaxation steps on grid 2, I32 and R23 are the
transfer operators between grids 2 and 3, and A3 is obtained by discretizing on grid
3. Extension to four or more grids proceeds in similar fashion.
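To make the cycle concrete, here is a minimal sketch of a single two-grid cycle for A = B(M: 1, −2, 1) with Dirichlet boundaries (Python/numpy assumed; damped Jacobi is used as the smoother in place of the three-step Richardson method, with injection and the linear interpolation of Eq. 10.15 as transfer operators):

```python
import numpy as np

def B(n):   # B(n: 1,-2,1)
    return (np.diag(-2.0*np.ones(n)) + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1))

def smooth(A, phi, f, sweeps, w=2.0/3.0):
    for _ in range(sweeps):                 # damped Jacobi, D = -2I
        phi = phi + w*(-0.5)*(f - A @ phi)
    return phi

M = 31
A, f = B(M), np.zeros(M)
phi = np.random.default_rng(0).standard_normal(M)   # exact solution is zero

phi = smooth(A, phi, f, 3)                    # pre-smooth on the fine grid
r = f - A @ phi                               # fine-grid residual
re = r[1::2]                                  # restrict to the even points
ee = np.linalg.solve(0.25*B((M - 1)//2), re)  # coarse-grid solve, Eq. 10.37

e = np.zeros(M)                               # prolong via Eq. 10.15, e_a=e_b=0
e[1::2] = ee
e[0], e[-1] = 0.5*ee[0], 0.5*ee[-1]
e[2:-1:2] = 0.5*(ee[:-1] + ee[1:])

phi = smooth(A, phi + e, f, 3)                # correct, then post-smooth
print(np.linalg.norm(f - A @ phi))            # residual greatly reduced
```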
10.4 Problems
1. Derive Eq. 10.51.
Solve exactly on the coarsest grid. Plot the solution after the residual is reduced
by 2, 3, and 4 orders of magnitude. Plot the logarithm of the L2 -norm of the
residual vs. the number of iterations. Determine the asymptotic convergence
rate. Calculate the theoretical asymptotic convergence rate and compare.
Chapter 11
NUMERICAL DISSIPATION
Consider the linear convection equation

    ∂u/∂t = −a ∂u/∂x    (11.1)
with periodic boundary conditions. Consider the following point operator for the
spatial derivative term
    −a(δ_x u)_j = −a/(2∆x) [ −(1 + β)u_{j−1} + 2βu_j + (1 − β)u_{j+1} ]
                = −a/(2∆x) [ (−u_{j−1} + u_{j+1}) + β(−u_{j−1} + 2u_j − u_{j+1}) ]    (11.2)
The second form shown divides the operator into an antisymmetric component (−u_{j−1} +
u_{j+1})/2∆x and a symmetric component β(−u_{j−1} + 2u_j − u_{j+1})/2∆x. The antisym-
metric component is the second-order centered difference operator. With β ≠ 0, the
operator is only first-order accurate. A backward difference operator is given by β = 1
and a forward difference operator is given by β = −1.
For periodic boundary conditions the corresponding matrix operator is
    −a δ_x = −a/(2∆x) B_p(−1 − β, 2β, 1 − β)
We see that the antisymmetric portion of the operator introduces odd derivative
terms in the truncation error while the symmetric portion introduces even derivatives.
Substituting this into Eq. 11.1 gives the modified PDE, Eq. 11.3, whose right-hand
side contains both even and odd space derivatives. Writing the modified PDE in the
generic form of Eq. 11.4, with ν and τ the coefficients of the even (second and fourth)
derivatives and a and γ appearing in the odd (first and third) derivative terms, a
single-wavenumber solution has exponent components

    r = −κ²(ν − τκ²) ,   s = −κ(a + γκ²)

The solution is composed of both amplitude and phase terms. Thus

    u = e^{−κ²(ν−τκ²)t} e^{iκ[x−(a+γκ²)t]}    (11.5)

where the first factor is the amplitude and the second the phase.
It is important to notice that the amplitude of the solution depends only upon ν and
τ , the coefficients of the even derivatives in Eq. 11.4, and the phase depends only on
a and γ, the coefficients of the odd derivatives.
If the wave speed a is positive, the choice of a backward difference scheme (β = 1)
produces a modified PDE with ν − τ κ2 > 0 and hence the amplitude of the solution
decays. This is tantamount to deliberately adding dissipation to the PDE. Under the
same condition, the choice of a forward difference scheme (β = −1) is equivalent to
deliberately adding a destabilizing term to the PDE.
By examining the term governing the phase of the solution in Eq. 11.5, we see
that the speed of propagation is a + γκ2 . Referring to the modified PDE, Eq. 11.3
we have γ = −a∆x2 /6. Therefore, the phase speed of the numerical solution is less
than the actual phase speed. Furthermore, the numerical phase speed is dependent
upon the wavenumber κ. This we refer to as dispersion.
Our purpose here is to investigate the properties of one-sided spatial differencing
operators relative to centered difference operators. We have seen that the three-
point centered difference approximation of the spatial derivative produces a modified
PDE that has no dissipation (or amplification). One can easily show, by using the
antisymmetry of the matrix difference operators, that the same is true for any cen-
tered difference approximation of a first derivative. As a corollary, any departure
from antisymmetry in the matrix difference operator must introduce dissipation (or
amplification) into the modified PDE.
Note that the use of one-sided differencing schemes is not the only way to in-
troduce dissipation. Any symmetric component in the spatial operator introduces
dissipation (or amplification). Therefore, one could choose β = 1/2 in Eq. 11.2. The
resulting spatial operator is not one-sided but it is dissipative. Biased schemes use
more information on one side of the node than the other. For example, a third-order
backward-biased scheme is given by
    (δ_x u)_j = 1/(6∆x) (u_{j−2} − 6u_{j−1} + 3u_j + 2u_{j+1})
              = 1/(12∆x) [ (u_{j−2} − 8u_{j−1} + 8u_{j+1} − u_{j+2})
                         + (u_{j−2} − 4u_{j−1} + 6u_j − 4u_{j+1} + u_{j+2}) ]    (11.6)
The Lax-Wendroff method provides a classical example of dissipation introduced
through the time-marching procedure. Start with a Taylor series expansion in time:

    u(x, t + h) = u + h ∂u/∂t + ½h² ∂²u/∂t² + O(h³)    (11.7)

First replace the time derivatives with space derivatives according to the PDE (in
this case, the linear convection equation ∂u/∂t + a ∂u/∂x = 0). Thus

    ∂u/∂t = −a ∂u/∂x ,    ∂²u/∂t² = a² ∂²u/∂x²    (11.8)

Now replace the space derivatives with three-point centered difference operators, giv-
ing

    u_j^{(n+1)} = u_j^{(n)} − ½(ah/∆x)(u_{j+1}^{(n)} − u_{j−1}^{(n)}) + ½(ah/∆x)²(u_{j+1}^{(n)} − 2u_j^{(n)} + u_{j−1}^{(n)})    (11.9)
For |ah/∆x| ≤ 1 all of the eigenvalues have modulus less than or equal to unity and hence
the method is stable independent of the sign of a. The quantity |ah/∆x| is known as the
Courant (or CFL) number. It is equal to the ratio of the distance travelled by a wave
in one time step to the mesh spacing.
The nature of the dissipative properties of the Lax-Wendroff scheme can be seen
by examining the modified partial differential equation, which is given by
    ∂u/∂t + a ∂u/∂x = −(a/6)(∆x² − a²h²) ∂³u/∂x³ − (a²h/8)(∆x² − a²h²) ∂⁴u/∂x⁴ + ···
This is derived by substituting Taylor series expansions for all terms in Eq. 11.9 and
converting the time derivatives to space derivatives using Eq. 11.8. The two leading
error terms appear on the right side of the equation. Recall that the odd derivatives on
the right side lead to unwanted dispersion and the even derivatives lead to dissipation
(or amplification, depending on the sign). Therefore, the leading error term in the
Lax-Wendroff method is dispersive and proportional to
    −(a/6)(∆x² − a²h²) ∂³u/∂x³ = −(a∆x²/6)(1 − C_n²) ∂³u/∂x³

The dissipative term is proportional to

    −(a²h/8)(∆x² − a²h²) ∂⁴u/∂x⁴ = −(a²h∆x²/8)(1 − C_n²) ∂⁴u/∂x⁴
This term has the appropriate sign and hence the scheme is truly dissipative as long
as Cn ≤ 1.
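This dissipation is easily demonstrated; the following sketch (Python/numpy assumed, periodic boundary conditions, single-wavenumber initial condition; not from the original text) advances Eq. 11.9 and shows the amplitude decaying when C_n < 1:

```python
import numpy as np

M, Cn = 64, 0.8                   # Courant number < 1: truly dissipative
x = 2*np.pi*np.arange(M)/M
u = np.sin(4*x)

for _ in range(200):
    up, um = np.roll(u, -1), np.roll(u, 1)     # periodic u_{j+1}, u_{j-1}
    u = u - 0.5*Cn*(up - um) + 0.5*Cn**2*(up - 2*u + um)   # Eq. 11.9

print(np.abs(u).max())            # < 1: the wave amplitude has decayed
```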
A closely related method is that of MacCormack. Recall MacCormack’s time-
marching method, presented in Chapter 6:
    ũ_{n+1} = u_n + h u′_n
    u_{n+1} = ½ [ u_n + ũ_{n+1} + h ũ′_{n+1} ]    (11.10)
If we use first-order backward differencing in the first stage and first-order forward
differencing in the second stage, a dissipative second-order method is obtained; for
the linear convection equation it can be shown to be identical to the Lax-Wendroff
method.

When the sign of a is not known in advance, the wave speed can be split into
non-negative and non-positive parts, a = a⁺ + a⁻, where a⁺ = (a + |a|)/2 ≥ 0 and
a⁻ = (a − |a|)/2 ≤ 0, so that a⁺ = a when a ≥ 0 and a⁻ = a when
a ≤ 0. Now for the a⁺ (≥ 0) term we can safely backward difference and for the
a− (≤ 0) term forward difference. This is the basic concept behind upwind methods,
that is, some decomposition or splitting of the fluxes into terms which have positive
and negative characteristic speeds so that appropriate differencing schemes can be
chosen. In the next two sections, we present two splitting techniques commonly used
with upwind methods. These are by no means unique.
The above approach to obtaining a stable discretization independent of the sign
of a can be written in a different, but entirely equivalent, manner. From Eq. 11.2, we
see that a stable discretization is obtained with β = 1 if a ≥ 0 and with β = −1 if
a ≤ 0. This is achieved by the following point operator:
    −a(δ_x u)_j = −1/(2∆x) [ a(−u_{j−1} + u_{j+1}) + |a|(−u_{j−1} + 2u_j − u_{j+1}) ]    (11.13)
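A sketch of this operator (Python/numpy assumed, periodic neighbors; not from the original) confirming that it reduces to backward differencing for a > 0 and forward differencing for a < 0:

```python
import numpy as np

def rhs(u, a, dx):                 # returns -a (delta_x u)_j of Eq. 11.13
    up, um = np.roll(u, -1), np.roll(u, 1)
    return -(a*(up - um) + abs(a)*(-um + 2*u - up))/(2*dx)

dx = 2*np.pi/32
u = np.sin(np.arange(32)*dx)

print(np.allclose(rhs(u,  1.0, dx), -(u - np.roll(u, 1))/dx))   # backward
print(np.allclose(rhs(u, -1.0, dx),  (np.roll(u, -1) - u)/dx))  # forward
```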
This approach is extended to systems of equations in Section 11.5.
In this section, we present the basic ideas of flux-vector and flux-difference splitting.
For more subtle aspects of implementation and application of such techniques to
nonlinear hyperbolic systems such as the Euler equations, the reader is referred to
the literature on this subject.
A linear, constant-coefficient, hyperbolic system of partial differential equations

    ∂u/∂t + ∂f/∂x = ∂u/∂t + A ∂u/∂x = 0    (11.14)

can be decoupled into characteristic equations of the form

    ∂w_i/∂t + λ_i ∂w_i/∂x = 0    (11.15)
where the wave speeds, λi , are the eigenvalues of the Jacobian matrix, A, and the
wi ’s are the characteristic variables. In order to apply a one-sided (or biased) spatial
differencing scheme, we need to apply a backward difference if the wave speed, λi , is
positive, and a forward difference if the wave speed is negative. To accomplish this,
let us split the matrix of eigenvalues, Λ, into two components such that
Λ = Λ+ + Λ− (11.16)
where
    Λ⁺ = (Λ + |Λ|)/2 ,    Λ⁻ = (Λ − |Λ|)/2    (11.17)
With these definitions, Λ+ contains the positive eigenvalues and Λ− contains the neg-
ative eigenvalues. We can now rewrite the system in terms of characteristic variables
as
    ∂w/∂t + Λ ∂w/∂x = ∂w/∂t + Λ⁺ ∂w/∂x + Λ⁻ ∂w/∂x = 0    (11.18)

The spatial terms have been split into two components according to the sign of the
wave speeds. We can use backward differencing for the Λ⁺ ∂w/∂x term and forward
differencing for the Λ⁻ ∂w/∂x term. Premultiplying by X and inserting the product
X⁻¹X in the spatial terms gives

    ∂(Xw)/∂t + XΛ⁺X⁻¹ ∂(Xw)/∂x + XΛ⁻X⁻¹ ∂(Xw)/∂x = 0    (11.19)

With the definitions²

    A⁺ = XΛ⁺X⁻¹ ,    A⁻ = XΛ⁻X⁻¹    (11.20)

and the relation u = Xw, this becomes

    ∂u/∂t + ∂A⁺u/∂x + ∂A⁻u/∂x = 0    (11.21)
Finally the split flux vectors are defined as

    f⁺ = A⁺u ,    f⁻ = A⁻u    (11.22)

and Eq. 11.21 can be rewritten as

    ∂u/∂t + ∂f⁺/∂x + ∂f⁻/∂x = 0    (11.23)
In the linear case, the definition of the split fluxes follows directly from the defini-
tion of the flux, f = Au. For the Euler equations, f is also equal to Au as a result of
their homogeneous property, as discussed in Appendix C. Note that
f = f+ + f− (11.24)
Thus by applying backward differences to the f⁺ term and forward differences to the
f⁻ term, we are in effect solving the characteristic equations in the desired manner.
This approach is known as flux-vector splitting.
When an implicit time-marching method is used, the Jacobians of the split flux
vectors are required. In the nonlinear case,
    ∂f⁺/∂u ≠ A⁺ ,    ∂f⁻/∂u ≠ A⁻    (11.25)

Therefore, one must find and use the new Jacobians given by

    A⁺⁺ = ∂f⁺/∂u ,    A⁻⁻ = ∂f⁻/∂u    (11.26)
For the Euler equations, A++ has eigenvalues which are all positive, and A−− has all
negative eigenvalues.
² With these definitions A⁺ has all positive eigenvalues, and A⁻ has all negative eigenvalues.
Here we have also used the relations f = Xg, u = Xw, and A = XΛX⁻¹.
In the linear, constant-coefficient case, this leads to an upwind operator which is
identical to that obtained using flux-vector splitting. However, in the nonlinear case,
there is some ambiguity regarding the definition of |A| at the cell interface j + 1/2.
In order to resolve this, consider a situation in which the eigenvalues of A are all of
the same sign. In this case, we would like our definition of fˆj+1/2 to satisfy
    f̂_{j+1/2} = f_L  if all λ_i > 0 ;    f̂_{j+1/2} = f_R  if all λ_i < 0    (11.35)

giving pure upwinding. If the eigenvalues of A are all positive, |A| = A; if they are
all negative, |A| = −A. Hence satisfaction of Eq. 11.35 is obtained by the definition

    f̂_{j+1/2} = ½(f_L + f_R) + ½ |A_{j+1/2}| (u_L − u_R)    (11.36)
if A_{j+1/2} satisfies

    A_{j+1/2} (u_R − u_L) = f_R − f_L    (11.37)

For the Euler equations for a perfect gas, Eq. 11.37 is satisfied by the flux Jacobian
evaluated at the Roe-average state given by

    u_{j+1/2} = ( √ρ_L u_L + √ρ_R u_R ) / ( √ρ_L + √ρ_R )    (11.38)

    H_{j+1/2} = ( √ρ_L H_L + √ρ_R H_R ) / ( √ρ_L + √ρ_R )    (11.39)
where u and H = (e + p)/ρ are the velocity and the total enthalpy per unit mass,
respectively.
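A minimal sketch of the Roe average (Python/numpy assumed; the state values below are arbitrary illustrations, not from the text):

```python
import numpy as np

def roe_average(rhoL, uL, HL, rhoR, uR, HR):
    wL, wR = np.sqrt(rhoL), np.sqrt(rhoR)
    u_hat = (wL*uL + wR*uR)/(wL + wR)      # Eq. 11.38
    H_hat = (wL*HL + wR*HR)/(wL + wR)      # Eq. 11.39
    return u_hat, H_hat

print(roe_average(1.0, 0.5, 2.5, 0.125, 0.0, 2.0))
```

The flux Jacobian evaluated at this averaged state then satisfies the property of Eq. 11.37.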
where |A| is defined in Eq. 11.34. The second spatial term is known as artificial
dissipation. It is also sometimes referred to as artificial diffusion or artificial viscosity.
With appropriate choices of δxa and δxs , this approach can be related to the upwind
approach. This is particularly evident from a comparison of Eqs. 11.36 and 11.43.
It is common to use the following operator for δ_x^s:

    (δ_x^s u)_j = (ϵ/∆x) (u_{j−2} − 4u_{j−1} + 6u_j − 4u_{j+1} + u_{j+2})    (11.44)

where ϵ is a problem-dependent coefficient. This symmetric operator approximates
ϵ∆x³ u_xxxx and thus introduces a third-order dissipative term. With an appropriate
value of ϵ, this often provides sufficient damping of high frequency modes without
greatly affecting the low frequency modes. For details of how this can be implemented
for nonlinear hyperbolic systems, the reader should consult the literature. A more
complicated treatment of the numerical dissipation is also required near shock waves
and other discontinuities, but is beyond the scope of this book.
11.6 Problems
1. A second-order backward difference approximation to a 1st derivative is given
as a point operator by
    (δ_x u)_j = 1/(2∆x) (u_{j−2} − 4u_{j−1} + 3u_j)
(a) Express this operator in banded matrix form (for periodic boundary condi-
tions), then derive the symmetric and skew-symmetric matrices that have
the matrix operator as their sum. (See Appendix A.3 to see how to con-
struct the symmetric and skew-symmetric components of a matrix.)
(b) Using a Taylor table, find the derivative which is approximated by the
corresponding symmetric and skew-symmetric operators and the leading
error term for each.
2. Find the modified wavenumber for the first-order backward difference operator.
Plot the real and imaginary parts of κ∗ ∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Using
Fourier analysis as in Section 6.6.2, find |σ| for the combination of this spatial
operator with 4th-order Runge-Kutta time marching at a Courant number of
unity and plot vs. κ∆x for 0 ≤ κ∆x ≤ π.
3. Find the modified wavenumber for the operator given in Eq. 11.6. Plot the real
and imaginary parts of κ∗ ∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Using Fourier analysis
as in Section 6.6.2, find |σ| for the combination of this spatial operator with
4th-order Runge-Kutta time marching at a Courant number of unity and plot
vs. κ∆x for 0 ≤ κ∆x ≤ π.
6. Show that the flux Jacobian for the 1-D Euler equations can be written in terms
of u and H. Show that the use of the Roe average state given in Eqs. 11.38 and
11.39 leads to satisfaction of Eq. 11.37.
Chapter 12

SPLIT AND FACTORED FORMS
In the next two chapters, we present and analyze split and factored algorithms. This
gives the reader a feel for some of the modifications which can be made to the basic
algorithms in order to obtain efficient solvers for practical multidimensional applica-
tions, and a means for analyzing such modified forms.
We begin with a few observations:

1. Matrix operators can be split into the sum of two or more matrices, and such a
splitting is not unique.

2. Advancing to the next time level always requires some reference to a previous
one.

3. Time marching methods are valid only to some order of accuracy in the step
size, h.
Consider the generic ODE

    du/dt = Au − f    (12.1)

and split the matrix A:
    du/dt = [A₁ + A₂]u − f    (12.2)

where A = [A₁ + A₂] but A₁ and A₂ are not unique. For the time march let us choose
the simple, first-order,¹ explicit Euler method. Then, from observation 2 (new data
u_{n+1} in terms of old u_n):

    u_{n+1} = [I + hA₁ + hA₂]u_n − hf + O(h²)    (12.3)

or its equivalent

    u_{n+1} = { [I + hA₁][I + hA₂] − h²A₁A₂ } u_n − hf + O(h²)

Finally, from observation 3 (allowing us to drop the higher order term −h²A₁A₂u_n):

    u_{n+1} = [I + hA₁][I + hA₂]u_n − hf + O(h²)    (12.4)
Notice that Eqs. 12.3 and 12.4 have the same formal order of accuracy and, in
this sense, neither one is to be preferred over the other. However, their numerical
stability can be quite different, and techniques to carry out their numerical evaluation
can have arithmetic operation counts that vary by orders of magnitude. Both of these
considerations are investigated later. Here we seek only to apply the concept of
factoring to some simple cases.
12.2 Factoring Physical Representations — Time Splitting

Consider the semi-discrete system

    du/dt = A_c u + A_d u + (bc)    (12.5)
where Ac and Ad are matrices representing the convection and dissipation terms,
respectively; and their sum forms the A matrix we have considered in the previous
sections. Choose again the explicit Euler time march so that
    u_{n+1} = [I + hA_d + hA_c]u_n + h(bc) + O(h²)    (12.6)
¹ Second-order time-marching methods are considered later.
Factoring Eq. 12.6 gives

    u_{n+1} = [I + hA_d][I + hA_c]u_n + h(bc) + O(h²)    (12.7)

and we see that Eq. 12.7 and the original unfactored form Eq. 12.6 have identical
orders of accuracy in the time approximation. Therefore, on this basis, their selection
is arbitrary. In practical applications² equations such as 12.7 are often applied in a
predictor-corrector sequence. In this case one could write

    ũ_{n+1} = [I + hA_c]u_n + h(bc)
    u_{n+1} = [I + hA_d]ũ_{n+1}    (12.8)
Factoring can also be useful to form split combinations of implicit and explicit
techniques. For example, another way to approximate Eq. 12.6 with the same order
of accuracy is given by the expression
un+1 = [ I − hAd ]−1 [ I + hAc ]un + h(bc)
+O(h2 )
= [ I + hAd + hAc ]un + h(bc) (12.9)
Original Unfactored Terms
if h · ||Ad || < 1, where ||Ad || is some norm of [Ad ]. This time a predictor-corrector
interpretation leads to the sequence
    ũ_{n+1} = [I + hA_c]u_n + h(bc)
    [I − hA_d]u_{n+1} = ũ_{n+1}    (12.10)
The convection operator is applied explicitly, as before, but the diffusion operator is
now implicit, requiring a tridiagonal solver if the diffusion term is central differenced.
Since numerical stiffness is generally much more severe for the diffusion process, this
factored form would appear to be superior to that provided by Eq. 12.8. However,
the important aspect of stability has yet to be discussed.
² We do not suggest that this particular method is suitable for use. We have yet to determine its stability, and a first-order time-march method is usually unsatisfactory.
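The structure of Eq. 12.10 is illustrated by the following sketch (Python/numpy assumed; central differences, homogeneous Dirichlet boundaries, a hand-written tridiagonal Thomas solve for the implicit diffusion factor, and an arbitrary step size — not from the original text):

```python
import numpy as np

def thomas(sub, diag, sup, b):
    # solve a tridiagonal system; sub[0] and sup[-1] are unused
    n = len(diag)
    d, rhs = diag.copy(), b.copy()
    for i in range(1, n):
        w = sub[i]/d[i - 1]
        d[i] -= w*sup[i - 1]
        rhs[i] -= w*rhs[i - 1]
    x = np.zeros(n)
    x[-1] = rhs[-1]/d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - sup[i]*x[i + 1])/d[i]
    return x

M, a, nu = 49, 1.0, 0.1
dx = 1.0/(M + 1)
h = 0.4*dx**2/nu                              # an assumed step size
u = np.sin(np.pi*np.linspace(dx, 1 - dx, M))

# explicit convection step: u~ = [I + h Ac] u
up = np.r_[u[1:], 0.0]
um = np.r_[0.0, u[:-1]]
u_tilde = u + h*(-a*(up - um)/(2*dx))

# implicit diffusion step: [I - h Ad] u_{n+1} = u~, Ad = (nu/dx^2) B(1,-2,1)
r = h*nu/dx**2
u = thomas(-r*np.ones(M), (1 + 2*r)*np.ones(M), -r*np.ones(M), u_tilde)
print(u[:4])
```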
We should mention here that Eq. 12.9 can be derived from a different point of view,
by writing Eq. 12.6 in the form

    (u_{n+1} − u_n)/h = A_c u_n + A_d u_{n+1} + (bc) + O(h²)

Then

    [I − hA_d]u_{n+1} = [I + hA_c]u_n + h(bc)

which is identical to Eq. 12.10.
    M_y |  13  23  33  43
      k |  12  22  32  42
      1 |  11  21  31  41        (12.12)
        +--------------------
           1    j   ···   M_x

³ This could also be called a 5 × 6 point mesh if the boundary points (not labeled in the sketch) were included, but in these notes we describe the size of a mesh by the number of interior points.
The matrix operators for these space vectors are given in Eq. 12.16, parts a and b,
where

    P_yx = P_xyᵀ = P_xy⁻¹

In this notation the model ODE system becomes⁴

    dU/dt = A_{x+y} U + (bc)    (12.14)
If it is important to be specific about the data-base structure, we use the notation
A_{x+y}^{(x)} or A_{x+y}^{(y)}, depending on the data-base chosen for the U it multiplies. Examples
are in Eq. 12.16, parts a and b. Notice that the matrices are not the same although
they represent the same derivative operation. Their structures are similar, however,
and they are related by the same permutation matrix that relates U^{(x)} to U^{(y)}. Thus

    A_{x+y}^{(x)} = P_xy · A_{x+y}^{(y)} · P_yx    (12.15)
⁴ Notice that A_{x+y} and U, which are notations used in the special case of space vectors, are subsets of A and u, used in the previous sections.
A_{x+y}^{(x)} · U^{(x)}:

    | • x . . | o . . . | . . . . |   11
    | x • x . | . o . . | . . . . |   21
    | . x • x | . . o . | . . . . |   31
    | . . x • | . . . o | . . . . |   41
    | o . . . | • x . . | o . . . |   12
    | . o . . | x • x . | . o . . |   22
    | . . o . | . x • x | . . o . |   32
    | . . . o | . . x • | . . . o |   42
    | . . . . | o . . . | • x . . |   13
    | . . . . | . o . . | x • x . |   23
    | . . . . | . . o . | . x • x |   33
    | . . . . | . . . o | . . x • |   43

a: Elements in the two-dimensional, central-difference matrix operator, A_{x+y}, for
the 3×4 mesh shown in Sketch 12.12. Data base composed of M_y x-vectors stored in
U^{(x)}. Entries for x → x, for y → o, for both → •.    (12.16)
A_{x+y}^{(y)} · U^{(y)}:

    | • o . | x . . | . . . | . . . |   11
    | o • o | . x . | . . . | . . . |   12
    | . o • | . . x | . . . | . . . |   13
    | x . . | • o . | x . . | . . . |   21
    | . x . | o • o | . x . | . . . |   22
    | . . x | . o • | . . x | . . . |   23
    | . . . | x . . | • o . | x . . |   31
    | . . . | . x . | o • o | . x . |   32
    | . . . | . . x | . o • | . . x |   33
    | . . . | . . . | x . . | • o . |   41
    | . . . | . . . | . x . | o • o |   42
    | . . . | . . . | . . x | . o • |   43

b: The same operator with data base composed of M_x y-vectors stored in U^{(y)}.
The operator can be split into its x and y contributions in either data base:

    A_{x+y}^{(x)} = A_x^{(x)} + A_y^{(x)}    (12.17)

    A_{x+y}^{(y)} = A_x^{(y)} + A_y^{(y)}    (12.18)

where

    A_y^{(x)} = P_xy A_y^{(y)} P_yx    and    A_x^{(x)} = P_xy A_x^{(y)} P_yx
The splittings in Eqs. 12.17 and 12.18 can be combined with factoring in the
manner described in Section 12.2. As an example (first-order in time), applying the
implicit Euler method to Eq. 12.14 gives
    U_{n+1}^{(x)} = U_n^{(x)} + h[ A_x^{(x)} + A_y^{(x)} ] U_{n+1}^{(x)} + h(bc)

or

    [ I − hA_x^{(x)} − hA_y^{(x)} ] U_{n+1}^{(x)} = U_n^{(x)} + h(bc) + O(h²)    (12.19)
As in Section 12.2, we retain the same first order accuracy with the alternative
    [ I − hA_x^{(x)} ][ I − hA_y^{(x)} ] U_{n+1}^{(x)} = U_n^{(x)} + h(bc) + O(h²)    (12.20)
Write this in predictor-corrector form and permute the data base of the second row.
There results
    [ I − hA_x^{(x)} ] Ũ^{(x)} = U_n^{(x)} + h(bc)
    [ I − hA_y^{(y)} ] U_{n+1}^{(y)} = Ũ^{(y)}    (12.21)
A_x^{(x)} · U^{(x)}:

    | x x . . |         |         |
    | x x x . |         |         |
    | . x x x |         |         |
    | . . x x |         |         |
    |         | x x . . |         |
    |         | x x x . |         |
    |         | . x x x |         |
    |         | . . x x |         |
    |         |         | x x . . |
    |         |         | x x x . |
    |         |         | . x x x |
    |         |         | . . x x |
                                        (12.22)
A_y^{(x)} · U^{(x)}:

    | o . . . | o . . . |         |
    | . o . . | . o . . |         |
    | . . o . | . . o . |         |
    | . . . o | . . . o |         |
    | o . . . | o . . . | o . . . |
    | . o . . | . o . . | . o . . |
    | . . o . | . . o . | . . o . |
    | . . . o | . . . o | . . . o |
    |         | o . . . | o . . . |
    |         | . o . . | . o . . |
    |         | . . o . | . . o . |
    |         | . . . o | . . . o |

The splitting of A_{x+y}^{(x)}.
A_x^{(y)} · U^{(y)}:

    | x . . | x . . |       |       |
    | . x . | . x . |       |       |
    | . . x | . . x |       |       |
    | x . . | x . . | x . . |       |
    | . x . | . x . | . x . |       |
    | . . x | . . x | . . x |       |
    |       | x . . | x . . | x . . |
    |       | . x . | . x . | . x . |
    |       | . . x | . . x | . . x |
    |       |       | x . . | x . . |
    |       |       | . x . | . x . |
    |       |       | . . x | . . x |
                                        (12.23)
A_y^{(y)} · U^{(y)}:

    | o o . |       |       |       |
    | o o o |       |       |       |
    | . o o |       |       |       |
    |       | o o . |       |       |
    |       | o o o |       |       |
    |       | . o o |       |       |
    |       |       | o o . |       |
    |       |       | o o o |       |
    |       |       | . o o |       |
    |       |       |       | o o . |
    |       |       |       | o o o |
    |       |       |       | . o o |

The splitting of A_{x+y}^{(y)}.
for realistic problems. Consider, for example, the problem of computing the time
advance in the unfactored form of the trapezoidal method given by Eq. 12.24
1 1
I − hAx+y Un+1 = I + hAx+y Un + h(bc)
2 2
Forming the right hand side poses no problem, but finding Un+1 requires the solution
of a sparse, but very large, set of coupled simultaneous equations having the matrix
form shown in Eq. 12.16 part a and b. Furthermore, in real cases involving the Euler
or Navier-Stokes equations, each symbol (o, x, •) represents a 4 × 4 block matrix with
entries that depend on the pressure, density and velocity field. Suppose we were to
solve the equations directly. The forward sweep of a simple Gaussian elimination fills6
all of the 4 × 4 blocks between the main and outermost diagonal7 (e.g. between •
and o in Eq. 12.16 part b.). This must be stored in computer memory to be used to
find the final solution in the backward sweep. If Ne represents the order of the small
block matrix (4 in the 2-D Euler case), the approximate memory requirement is
(Ne × My ) · (Ne × My ) · Mx
floating point words. Here it is assumed that My < Mx . If My > Mx , My and Mx
would be interchanged. A moderate mesh of 60 × 200 points would require over 11
million words to find the solution. Actually current computer power is able to cope
rather easily with storage requirements of this order of magnitude. With computing
speeds of over one gigaflop,8 direct solvers may become useful for finding steady-state
solutions of practical problems in two dimensions. However, a three-dimensional
solver would require a memory of approximately

    N_e² · M_y² · M_z² · M_x
words and, for well resolved flow fields, this probably exceeds memory availability for
some time to come.
On the other hand, consider computing a solution using the factored implicit equa-
tion 12.25. Again computing the right hand side poses no problem. Accumulate the
result of such a computation in the array (RHS). One can then write the remaining
terms in the two-step predictor-corrector form
    [ I − ½hA_x^{(x)} ] Ũ^{(x)} = (RHS)^{(x)}
    [ I − ½hA_y^{(y)} ] U_{n+1}^{(y)} = Ũ^{(y)}    (12.28)
⁶ For matrices as small as those shown there are many gaps in this "fill", but for meshes of practical size the fill is mostly dense.
⁷ The lower band is also computed but does not have to be saved unless the solution is to be repeated for another vector.
⁸ One billion floating-point operations per second.
which has the same appearance as Eq. 12.21 but is second-order time accurate. The
first step would be solved using M_y uncoupled block tridiagonal solvers. Inspecting
the top of Eq. 12.22, we see that this is equivalent to solving My one-dimensional
problems, each with Mx blocks of order Ne . The temporary solution Ũ (x) would then
be permuted to Ũ (y) and an inspection of the bottom of Eq. 12.23 shows that the final
step consists of solving Mx one-dimensional implicit problems each with dimension
My .
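This line-by-line structure is easy to mimic with dense solves; the following sketch (Python/numpy assumed, scalar unknowns so N_e = 1, simple diffusion-like operators; not from the original text) performs the two sweeps of Eq. 12.28:

```python
import numpy as np

def B(n):   # B(n: 1,-2,1)
    return (np.diag(-2.0*np.ones(n)) + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1))

Mx, My, h = 8, 6, 0.1
RHS = np.ones((My, Mx))                    # accumulated right-hand side

Lx = np.eye(Mx) - 0.5*h*B(Mx)              # x-direction implicit factor
Ly = np.eye(My) - 0.5*h*B(My)              # y-direction implicit factor

U_tilde = np.linalg.solve(Lx, RHS.T).T     # My independent x-line solves
U_next  = np.linalg.solve(Ly, U_tilde)     # Mx independent y-line solves
print(U_next.shape)                        # (My, Mx)
```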
The point at which the factoring is made may not affect the order of time-accuracy,
but it can have a profound effect on the stability and convergence properties of a
method. For example, the unfactored form of a first-order method derived from the
implicit Euler time march is given by Eq. 12.19, and if it is immediately factored,
the factored form is presented in Eq. 12.20. On the other hand, the delta form of the
unfactored Eq. 12.19 is

    [I − hA_x − hA_y] ΔU_n = h[ A_{x+y} U_n + (bc) ]    (12.30)

and its factored form is

    [I − hA_x][I − hA_y] ΔU_n = h[ A_{x+y} U_n + (bc) ]    (12.31)

In spite of the similarities in derivation, we will see in the next chapter that the
convergence properties of Eq. 12.20 and Eq. 12.31 are vastly different.
12.7 Problems
1. Consider the 1-D heat equation:
    ∂u/∂t = ν ∂²u/∂x² ,    0 ≤ x ≤ 9

Let u(0, t) = 0 and u(9, t) = 0, so that we can simplify the boundary conditions.
Assume that second order central differencing is used, i.e.,

    (δ_xx u)_j = (1/∆x²)(u_{j−1} − 2u_j + u_{j+1})
The uniform grid has ∆x = 1 and 8 interior points.
iv. The generic ODE representing the discrete form of the heat equation is

        du^(1)/dt = A₁u^(1) + f

    Write down the matrix A₁. (Note f = 0, due to the boundary conditions.)
    Next find the matrix A₂ such that

        du^(2)/dt = A₂u^(2)

    Note that A₂ can be written as

        A₂ = [ D  Uᵀ ]
             [ U  D  ]

    Define D and U.
v. Applying implicit Euler time marching, write the delta form of the
implicit algorithm. Comment on the form of the resulting implicit
matrix operator.
(b) System definition
In problem 1a, we defined u(1) , u(2) , A1 , A2 , P12 , and P21 which partition
the odd points from the even points. We can put such a partitioning to
use. First define extraction operators
    I^(o) = [ I₄  0₄ ]        I^(e) = [ 0₄  0₄ ]
            [ 0₄  0₄ ]                [ 0₄  I₄ ]

(written out as 8×8 arrays, I^(o) has ones in the first four diagonal entries and I^(e)
has ones in the last four) which extract the odd and even points from u^(2) as follows:
u^(o) = I^(o)u^(2) and u^(e) = I^(e)u^(2).
i. Beginning with the ODE written in terms of u(2), define a splitting
A2 = Ao + Ae , such that Ao operates only on the odd terms, and Ae
operates only on the even terms. Write out the matrices Ao and Ae .
Also, write them in terms of D and U defined above.
ii. Apply implicit Euler time marching to the split ODE. Write down the
delta form of the algorithm and the factored delta form. Comment on
the order of the error terms.
iii. Examine the implicit operators for the factored delta form. Comment
on their form. You should be able to argue that these are now triangular
matrices (a lower and an upper). Comment on the solution process
this gives us relative to the direct inversion of the original system.
Chapter 13

LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
In Section 4.4 we introduced the concept of the representative equation, and used
it in Chapter 7 to study the stability, accuracy, and convergence properties of time-
marching schemes. The question is: Can we find a similar equation that will allow
us to evaluate the stability and convergence properties of split and factored schemes?
The answer is yes — for certain forms of linear model equations.
The analysis in this chapter is useful for estimating the stability and steady-state
properties of a wide variety of time-marching schemes that are variously referred
to as time-split, fractional-step, hybrid, and (approximately) factored. When these
methods are applied to practical problems, the results found from this analysis are
neither necessary nor sufficient to guarantee stability. However, if the results indicate
that a method has an instability, the method is probably not suitable for practical
use.
Note that each λ_a pairs with one, and only one², λ_b since they must share a common
eigenvector. This suggests (see Section 4.4) a representative equation of the form
du/dt = [λ_a + λ_b]u + ae^{μt}.

As an example, consider the linear convection-diffusion equation

    ∂u/∂t + a ∂u/∂x = ν ∂²u/∂x²    (13.4)
² This is to be contrasted to the developments found later in the analysis of 2-D equations.
[Figure 13.1: Stability regions in the (R_Δ, C_n) plane. Left: the explicit-implicit method, with boundary C_n = 2/R_Δ. Right: the explicit-explicit method, with boundaries C_n = R_Δ/2 and C_n = 2/R_Δ.]
A simple numerical parametric study of Eq. 13.7 shows that the critical range of
θ_m for any combination of C_n and R_Δ occurs when θ_m is near 0 (or 2π). From this
we find that the condition on C_n and R_Δ that makes |σ| ≈ 1 is

    1 + C_n² sin²ε = [ 1 + (4C_n/R_Δ) sin²(ε/2) ]²

As ε → 0 this gives the stability region

    C_n < 2/R_Δ

which is bounded by a hyperbola and shown in Fig. 13.1.
Accuracy

From the characteristic polynomial we see that there are two σ-roots and they are
given by the equation

    σ = { 1 + (3/2)hλ_c + ½hλ_d ± √[ (1 + (3/2)hλ_c + ½hλ_d)² − 2hλ_c(1 − ½hλ_d) ] } / { 2(1 − ½hλ_d) }    (13.10)
The principal σ-root follows from the plus sign and one can show

    σ₁ = 1 + (λ_c + λ_d)h + ½(λ_c + λ_d)²h² + ¼( λ_d³ + λ_cλ_d² − λ_c²λ_d − λ_c³ )h³

From this equation it is clear that (1/6)λ³ = (1/6)(λ_c + λ_d)³ does not match the coefficient
of h³ in σ₁, so

    er_λ = O(h³)

Using P(e^{μh}) and Q(e^{μh}) to evaluate er_μ in Section 6.6.3, one can show that

    er_μ = O(h³)
using either Eq. 13.8 or Eq. 13.9. These results show that, for the model equation,
the hybrid method retains the second-order accuracy of its individual components.
Stability
The stability of the method can be found from Eq. 13.10 by a parametric study of C_n
and R_Δ defined in Eq. 13.7. This was carried out in a manner similar to that used
to find the stability boundary of the first-order explicit-implicit method in Section
13.2.1. The results are plotted in Fig. 13.2. For values of R∆ ≥ 2 this second-order
method has a much greater region of stability than the first-order explicit-implicit
method given by Eq. 12.10 and shown in Fig. 13.1.
[Figure 13.2: Stability region, in the (R_Δ, C_n) plane, for the second-order time-split method.]
13.3 The Representative Equation for Space-Split Operators

Consider the 2-D model equations: the diffusion equation

    ∂u/∂t = ν( ∂²u/∂x² + ∂²u/∂y² )    (13.11)

and the biconvection equation

    ∂u/∂t + a_x ∂u/∂x + a_y ∂u/∂y = 0    (13.12)
where B is a banded matrix of the form B(b₋₁, b₀, b₁). Using the eigenvectors of B
and B̃ to construct a block similarity transform, one finds the block-diagonal form

    diag( λ₁I + Λ̃ ,  λ₂I + Λ̃ ,  λ₃I + Λ̃ ,  λ₄I + Λ̃ )    (13.15)
• The diagonal matrix on the right side of Eq. 13.15 contains every possible com-
bination of the individual eigenvalues of B and B̃.
Now we are ready to present the representative equation for two dimensional sys-
tems. First reduce the PDE to ODE by some choice4 of space differencing. This
results in a spatially split A matrix formed from the subsets
    A_x^{(x)} = diag(B) ,    A_y^{(y)} = diag(B̃)    (13.16)
where B and B̃ are any two matrices that have linearly independent eigenvectors (this
puts some constraints on the choice of differencing schemes).
Although Ax and Ay do commute, this fact, by itself, does not ensure the prop-
erty of “all possible combinations”. To obtain the latter property the structure of
the matrices is important. The block matrices B and B̃ can be either circulant or
noncirculant; in both cases we are led to the final result: the representative equation
for fully space-split, two-dimensional systems is du/dt = [λ_x + λ_y]u + ae^{μt}.
Often we are interested in finding the value of, and the convergence rate to, the
steady-state solution of the representative equation. In that case we set µ = 0 and
use the simpler form
    du/dt = [λ_x + λ_y]u + a    (13.17)

which has the exact solution

    u(t) = c e^{(λ_x+λ_y)t} − a/(λ_x + λ_y)    (13.18)
⁴ We have used 3-point central differencing in our example, but this choice was for convenience only, and its use is not necessary to arrive at Eq. 13.15.
From this solution we can assess two things: the stability of a given method and
the accuracy of its converged steady-state solution. Consider first the unfactored
implicit Euler method, for which

    (1 − hλ_x − hλ_y)u_{n+1} = u_n + ha

from which

    P(E) = (1 − hλ_x − hλ_y)E − 1
    Q(E) = h    (13.19)
One finds that this method:

1. Is unconditionally stable.

2. Produces the exact (see Eq. 13.18) steady-state solution (of the ODE) for any
h.
Unfortunately, however, use of this method for 2-D problems is generally impractical
for reasons discussed in Section 12.5.
A similar analysis of the factored delta form of the implicit Euler method shows
that it:

1. Is unconditionally stable.

2. Produces the exact steady-state solution for any choice of h.

3. Converges very slowly to the steady-state solution for large values of h, since
|σ| → 1 as h → ∞.
Like the factored nondelta form, this method demands far less storage than the un-
factored form, as discussed in Section 12.5. The correct steady solution is obtained,
but convergence is not nearly as rapid as that of the unfactored form.
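These convergence properties are easy to see numerically. The following sketch (Python, taking λ_x = λ_y = λ real and negative; the factored-delta amplification factor σ = 1 + h(λ_x + λ_y)/[(1 − hλ_x)(1 − hλ_y)] follows from the scalar form of the factored delta equation) compares the two methods as h grows:

```python
lam = -1.0
for h in (0.1, 1.0, 10.0, 100.0):
    sigma_unfactored = 1.0/(1.0 - 2.0*h*lam)             # from Eq. 13.19
    sigma_factored_delta = 1.0 + 2.0*h*lam/(1.0 - h*lam)**2
    print(h, abs(sigma_unfactored), abs(sigma_factored_delta))
# as h grows, the unfactored |sigma| -> 0 but the factored-delta |sigma| -> 1
```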
For the three-dimensional model equations⁶, the factored delta form of the second-
order (trapezoidal) method leads to a solution of the form

    u_n = c σⁿ − a/(λ_x + λ_y + λ_z)    (13.23)

⁶ Eqs. 13.11 and 13.12, each with an additional term.
or in the form

    u_n = c [ ( (1 + ½hλ_x)(1 + ½hλ_y)(1 + ½hλ_z) − ¼h³λ_xλ_yλ_z )
            / ( (1 − ½hλ_x)(1 − ½hλ_y)(1 − ½hλ_z) ) ]ⁿ − a/(λ_x + λ_y + λ_z)    (13.24)
It is interesting to notice that a Taylor series expansion of Eq. 13.24 results in

    σ = 1 + h(λ_x + λ_y + λ_z) + ½h²(λ_x + λ_y + λ_z)²    (13.25)
        + ¼h³[ λ_z³ + (2λ_y + 2λ_x)λ_z² + (2λ_y² + 3λ_xλ_y + 2λ_x²)λ_z
              + λ_y³ + 2λ_xλ_y² + 2λ_x²λ_y + λ_x³ ] + ···
which verifies the second order accuracy of the factored form. Furthermore, clearly,
if the method converges, it converges to the proper steady-state.7
With regards to stability, it follows from Eq. 13.23 that, if all the λ’s are real and
negative, the method is stable for all h. This makes the method unconditionally stable
for the 3-D diffusion model when it is centrally differenced in space.
Now consider what happens when we apply this method to the biconvection model,
the 3-D form of Eq. 13.12 with periodic boundary conditions. In this case, central
differencing causes all of the λ’s to be imaginary with spectrums that include both
positive and negative values. Remember that in our analysis we must consider every
possible combination of these eigenvalues. First write the σ root in Eq. 13.23 in the
form

    σ = (1 + iα − β + iγ) / (1 − iα − β + iγ)

where α, β and γ are real numbers that can have any sign. Now we can always find
one combination of the λ's for which α and γ are both positive. In that case, since
the absolute value of the product is the product of the absolute values,

    |σ|² = [ (1 − β)² + (α + γ)² ] / [ (1 − β)² + (α − γ)² ] > 1
and the method is unconditionally unstable for the model convection problem.
From the above analysis one would come to the conclusion that the method rep-
resented by Eq. 13.22 should not be used for the 3-D Euler equations. In practical
cases, however, some form of dissipation is almost always added to methods that are
used to solve the Euler equations and our experience to date is that, in the presence
of this dissipation, the instability disclosed above is too weak to cause trouble.
⁷ However, we already knew this because we chose the delta form.
13.6 Problems
1. Starting with the generic ODE,
    du/dt = Au + f
dt
we can split A as follows: A = A1 + A2 + A3 + A4 . Applying implicit Euler time
marching gives
    (u_{n+1} − u_n)/h = A₁u_{n+1} + A₂u_{n+1} + A₃u_{n+1} + A₄u_{n+1} + f
(a) Write the factored delta form. What is the error term?
(b) Instead of making all of the split terms implicit, leave two explicit:
(b) Instead of making all of the split terms implicit, leave two explicit:

    (u_{n+1} − u_n)/h = A₁u_{n+1} + A₂u_n + A₃u_{n+1} + A₄u_n + f
Write the resulting factored delta form and define the error terms.
(c) The scalar representative equation is
(c) The scalar representative equation is

    du/dt = (λ₁ + λ₂ + λ₃ + λ₄)u + a
For the fully implicit scheme of problem 1a, find the exact solution to the
resulting scalar difference equation and comment on the stability, conver-
gence, and accuracy of the converged steady-state solution.
(d) Repeat 1c for the explicit-implicit scheme of problem 1b.
Appendix A

USEFUL RELATIONS AND DEFINITIONS FROM LINEAR ALGEBRA
A.1 Notation
1. In the present context a vector is a vertical column or string. Thus

       v = [v₁, v₂, ···, v_m]ᵀ

   and its transpose vᵀ is the horizontal row

       vᵀ = [v₁, v₂, v₃, ···, v_m] ,    v = [v₁, v₂, v₃, ···, v_m]ᵀ
4. The inverse of a matrix (if it exists) is written A−1 and has the property that
A−1 A = AA−1 = I, where I is the identity matrix.
A.2 Definitions
1. A is symmetric if AT = A.
2. A is skew-symmetric or antisymmetric if AT = −A.
3. A is diagonally dominant if a_ii ≥ Σ_{j≠i} |a_ij|, i = 1, 2, ···, m, and a_ii > Σ_{j≠i} |a_ij|
for at least one i.
4. A is orthogonal if aij are real and AT A = AAT = I
5. Ā is the complex conjugate of A.
6. P is a permutation matrix if P v is a simple reordering of v.
7. The trace of a matrix is Σ_i a_ii.
8. A is normal if AT A = AAT .
9. det[A] is the determinant of A.
10. AH is the conjugate transpose of A, (Hermitian).
11. If

        A = [ a  b ]
            [ c  d ]

    then

        det[A] = ad − bc

    and

        A⁻¹ = (1/det[A]) [  d  −b ]
                         [ −c   a ]
A.3 Algebra
We consider only square matrices of the same dimension.
2. A + (B + C) = (C + A) + B, etc.
4. In general AB ≠ BA.
5. Transpose equalities:

       (A + B)ᵀ = Aᵀ + Bᵀ
       (Aᵀ)ᵀ = A
       (AB)ᵀ = BᵀAᵀ

6. Inverse equalities (if the inverses exist):

       (A⁻¹)⁻¹ = A
       (AB)⁻¹ = B⁻¹A⁻¹
       (Aᵀ)⁻¹ = (A⁻¹)ᵀ
A.4 Eigensystems
1. The eigenvalue problem for a matrix A is defined as

       Ax = λx

   and the eigenvalues are the roots of the characteristic polynomial det[A − λI] = 0.
2. If a square matrix with real elements is symmetric, its eigenvalues are all real.
If it is antisymmetric, they are all imaginary.
A set of vectors is linearly independent if no nontrivial linear combination ax + by + ···
of its members vanishes, for any complex a and b and for all combinations of vectors in the set.
A matrix with a complete set of linearly independent eigenvectors can be diagonalized
by the similarity transform

    X⁻¹AX = Λ

where the columns of X are the eigenvectors and Λ is the diagonal matrix of eigenvalues.
8. In general, the eigenvalues of a matrix may not be distinct, in which case the
possibility exists that it cannot be diagonalized. If the eigenvalues of a matrix
are not distinct, but all of the eigenvectors are linearly independent, the matrix
is said to be derogatory, but it can still be diagonalized.
10. Defective matrices cannot be diagonalized but they can still be put into a com-
pact form by a similarity transform, S, such that
    J = S⁻¹AS = diag(J₁, J₂, ···, J_k)

where the J_i are Jordan subblocks arranged along the diagonal.
12. Use of the transform S is known as putting A into its Jordan Canonical form.
A repeated root in a Jordan block is referred to as a defective eigenvalue. For
each Jordan submatrix with an eigenvalue λi of multiplicity r, there exists one
eigenvector. The other r − 1 vectors associated with this eigenvalue are referred
to as principal vectors. The complete set of principal vectors and eigenvectors
are all linearly independent.
13. For example, with P the reversal permutation matrix,

        P⁻¹ [ λ 1 0 ]     [ λ 0 0 ]
            [ 0 λ 1 ] P = [ 1 λ 0 ]
            [ 0 0 λ ]     [ 0 1 λ ]
14. Some of the Jordan subblocks may have the same eigenvalue. For example, the
matrix made up of the Jordan subblocks

    J₃(λ₁), J₁(λ₁), J₂(λ₁), J₂(λ₂), J₁(λ₃)

arranged along the diagonal, where J_r(λ) denotes an r×r Jordan subblock with
eigenvalue λ,
is both defective and derogatory, having:
• 9 eigenvalues
• 3 distinct eigenvalues
• 3 Jordan blocks
• 5 linearly independent eigenvectors
• 3 principal vectors with λ1
• 1 principal vector with λ2
    ||A||_p = max_{v≠0} ( ||Av||_p / ||v||_p )
4. Let A and B be square matrices of the same order. All matrix norms must have
the properties

       ||A|| ≥ 0 , with ||A|| = 0 implying A = 0
       ||cA|| = |c| · ||A||
       ||A + B|| ≤ ||A|| + ||B||
       ||A · B|| ≤ ||A|| · ||B||

5. The spectral radius of A is σ(A) = max_m |λ_m|, the magnitude of its largest
eigenvalue.

6. In general σ(A) does not satisfy the conditions in 4, so in general σ(A) is not a
true norm.

7. When A is normal, σ(A) is a true norm; in fact, in this case it is the L₂ norm.

8. The spectral radius of A, σ(A), is the lower bound of all the norms of A.
Appendix B
SOME PROPERTIES OF
TRIDIAGONAL MATRICES
These vectors are the columns of the right-hand eigenvector matrix, the elements of
which are

    X = (x_jm) = (a/c)^{(j−1)/2} sin( jmπ/(M+1) ) ,   j = 1, 2, ···, M ;  m = 1, 2, ···, M    (B.4)

Notice that if a = −1 and c = 1,

    (a/c)^{(j−1)/2} = e^{i(j−1)π/2}    (B.5)

The left-hand eigenvector matrix of B(a, b, c) can be written

    X⁻¹ = ( 2/(M+1) ) (c/a)^{(m−1)/2} sin( mjπ/(M+1) ) ,   m = 1, 2, ···, M ;  j = 1, 2, ···, M

In this case notice that if a = −1 and c = 1,

    (c/a)^{(m−1)/2} = e^{−i(m−1)π/2}    (B.6)
if

    b − λ_m e + 2 √[ (a − λ_m d)(c − λ_m f) ] cos( mπ/(M+1) ) = 0 ,   m = 1, 2, ···, M    (B.8)

If we define

    θ_m = mπ/(M+1) ,   p_m = cos θ_m

the eigenvalues are

    λ_m = { eb − 2(cd + af)p_m² + 2p_m √[ (ec − fb)(ea − bd) + (cd − af)²p_m² ] } / ( e² − 4fd p_m² )

The right-hand eigenvectors are

    (x_m)_j = [ (a − λ_m d)/(c − λ_m f) ]^{(j−1)/2} sin[ jθ_m ] ,   m = 1, 2, ···, M ;  j = 1, 2, ···, M
    D_M ≡ det[B(M: a, b, c)]

One finds

    D₀ = 1
    D₁ = b
    D₂ = b² − ac
    D₃ = b³ − 2abc    (B.9)

and, in general, the two-term recursion

    D_M = b D_{M−1} − ac D_{M−2}    (B.10)

Eq. B.10 is a linear O∆E, the solution of which was discussed in Section 4.2. Its
characteristic polynomial is P(E) = E² − bE + ac, and the two roots of P(σ) = 0
result in the solution

    D_M = ( 1/√(b² − 4ac) ) { [ (b + √(b² − 4ac))/2 ]^{M+1} − [ (b − √(b² − 4ac))/2 ]^{M+1} } ,
          M = 0, 1, 2, ···    (B.11)
where we have made use of the initial conditions D₀ = 1 and D₁ = b. In the limiting
case when b² − 4ac = 0, one can show that

    D_M = (M + 1)(b/2)^M ;    b² = 4ac
Then for M = 4

    B⁻¹ = (1/D₄) [  D₃      −cD₂      c²D₁    −c³D₀ ]
                 [ −aD₂     D₁D₂     −cD₁D₁    c²D₁ ]
                 [  a²D₁   −aD₁D₁     D₂D₁    −cD₂  ]
                 [ −a³D₀    a²D₁     −aD₂      D₃   ]

and for M = 5

    B⁻¹ = (1/D₅) [  D₄      −cD₃      c²D₂     −c³D₁     c⁴D₀ ]
                 [ −aD₃     D₁D₃     −cD₁D₂    c²D₁D₁   −c³D₁ ]
                 [  a²D₂   −aD₁D₂     D₂D₂    −cD₂D₁    c²D₂  ]
                 [ −a³D₁    a²D₁D₁   −aD₂D₁    D₃D₁    −cD₃   ]
                 [  a⁴D₀   −a³D₁      a²D₂    −aD₃      D₄    ]

The general element d_mn of B⁻¹ is:

Upper triangle:

    m = 1, 2, ···, M − 1 ;  n = m + 1, m + 2, ···, M :    d_mn = D_{m−1} D_{M−n} (−c)^{n−m} / D_M

Diagonal:

    n = m = 1, 2, ···, M :    d_mm = D_{m−1} D_{M−m} / D_M

Lower triangle:

    m = n + 1, n + 2, ···, M ;  n = 1, 2, ···, M − 1 :    d_mn = D_{M−m} D_{n−1} (−a)^{m−n} / D_M
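A sketch verifying these formulas against a direct inverse (Python/numpy assumed; not from the original text):

```python
import numpy as np

M, a, b, c = 5, 1.0, -2.0, 1.0
D = np.zeros(M + 1)
D[0], D[1] = 1.0, b
for k in range(2, M + 1):                 # recursion of Eq. B.10
    D[k] = b*D[k - 1] - a*c*D[k - 2]

Binv = np.zeros((M, M))
for m in range(1, M + 1):
    for n in range(1, M + 1):
        if m <= n:                        # upper triangle and diagonal
            Binv[m - 1, n - 1] = D[m - 1]*D[M - n]*(-c)**(n - m)/D[M]
        else:                             # lower triangle
            Binv[m - 1, n - 1] = D[M - m]*D[n - 1]*(-a)**(m - n)/D[M]

Bmat = (np.diag(b*np.ones(M)) + np.diag(c*np.ones(M - 1), 1)
        + np.diag(a*np.ones(M - 1), -1))
print(np.allclose(Binv @ Bmat, np.eye(M)))    # True
```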
B.4 Eigensystems of Circulant Matrices

Consider the circulant tridiagonal matrix

    B_p(M: a, b, c)    (B.12)
B.5 Special Cases Found from Symmetries

Consider a mesh with an even number of interior points such as that shown in Fig.
B.1. One can seek from the tridiagonal matrix B(2M: a, b, a) the eigenvector subset
that has even symmetry when spanning the interval 0 ≤ x ≤ π. For example, we seek
the set of eigenvectors x_m for which
    [ b a         ] [ x₁ ]          [ x₁ ]
    [ a b a       ] [ x₂ ]          [ x₂ ]
    [   a · ·     ] [ ·  ]  =  λ_m  [ ·  ]
    [     · · a   ] [ x₂ ]          [ x₂ ]
    [       a b   ] [ x₁ ]          [ x₁ ]
This leads to the folded M×M system, which is B(M: a, b, a) with its last diagonal
element replaced by b + a:

    [ b  a              ]
    [ a  b  a           ]
    [    a  ·  ·        ]  x_m = λ_m x_m    (B.16)
    [       ·  ·   a    ]
    [          a  b + a ]
By folding the known eigenvectors of B(2M: a, b, a) about the center, one can show
from previous results that the eigenvalues of Eq. B.16 are

    λ_m = b + 2a cos( (2m − 1)π/(2M + 1) ) ,   m = 1, 2, ···, M    (B.17)
A similar construction for the other symmetry gives

    λ_m = b + 2a cos( (2m − 1)π/(2M) ) ,   m = 1, 2, ···, M
B.6 Special Cases Involving Boundary Conditions

When both boundary conditions are Dirichlet, the operator is B(M: 1, −2, 1) and

    [ −2   1              ]
    [  1  −2   1          ]        λ_m = −2 + 2 cos( mπ/(M+1) )
    [      1  −2   1      ]                                          (B.18)
    [          1  −2   1  ]        (x_m)_j = sin( jmπ/(M+1) )
    [              1  −2  ]
When one boundary condition is Dirichlet and the other is Neumann (and a diagonal
preconditioner is applied to scale the last equation),
    [ −2   1              ]
    [  1  −2   1          ]        λ_m = −2 + 2 cos( (2m − 1)π/(2M + 1) )
    [      1  −2   1      ]                                                  (B.19)
    [          1  −2   1  ]        (x_m)_j = sin( j(2m − 1)π/(2M + 1) )
    [              1  −1  ]
Appendix C
THE HOMOGENEOUS
PROPERTY OF THE EULER
EQUATIONS
The Euler equations have a special property that is sometimes useful in constructing
numerical methods. In order to examine this property, let us first inspect Euler’s
theorem on homogeneous functions. Consider first the scalar case. If F(u, v) satisfies
the identity

    F(αu, αv) = αⁿ F(u, v)    (C.1)

for a fixed n, F is called homogeneous of degree n, and differentiation with respect to
α (followed by setting α = 1) gives

    u ∂F/∂u + v ∂F/∂v = n F(u, v)    (C.2)
Consider next the theorem as it applies to systems of equations. If the vector
F(Q) satisfies the identity

    F(αQ) = αⁿ F(Q)    (C.3)

for a fixed n, F is said to be homogeneous of degree n and we find

    [∂F/∂Q] Q = n F(Q)    (C.4)
Now it is easy to show, by direct use of eq. C.3, that both E and F in eqs. 2.11 and
2.12 are homogeneous of degree 1, and their Jacobians, A and B, are homogeneous
of degree 0 (actually the latter is a direct consequence of the former).1 This being
the case, we notice that the expansion of the flux vector in the vicinity of tn which,
according to eq. 6.105 can be written in general as,
    E = E_n + A_n(Q − Q_n) + O(h²)
    F = F_n + B_n(Q − Q_n) + O(h²)    (C.5)

can be written, since E_n = A_nQ_n and F_n = B_nQ_n by homogeneity of degree 1,

    E = A_n Q + O(h²)
    F = B_n Q + O(h²)    (C.6)
Notice also that, by the chain rule,

    ∂F(Q)/∂x = [∂F/∂Q] ∂Q/∂x = A ∂Q/∂x    (C.7)

while differentiating the identity F = AQ gives

    ∂F/∂x = A ∂Q/∂x + [∂A/∂x] Q    (C.8)

Comparing Eqs. C.7 and C.8, we find that

    [∂A/∂x] Q = 0    (C.9)

in spite of the fact that individually [∂A/∂x] and Q are not equal to zero.
¹ Note that this depends on the form of the equation of state. The Euler equations are homogeneous if the equation of state can be written in the form p = ρf(ε), where ε is the internal energy per unit mass.