HTM-2013-IEEE
ABSTRACT In this paper, a new ordinary differential equation (ODE) numerical integration method is successfully applied to problems from various mathematical branches, such as partial differential equation (PDE) boundary value problems, PDE initial-boundary value problems, tough nonlinear equations, and so forth. The new method does not use the Jacobian, so it can handle very large systems, say of dimension N = 1,000,000, or even larger. In addition, we give a very simple approach for accelerating the convergence of the linear algebraic equations arising from linear PDE boundary value problems. All the numerical results show that the new method is very promising for super large scale systems.
INDEX TERMS Super large scale systems, numerical solution, ODE method, PDE boundary problem,
nonlinear equations, linear equations, inexact Newton method, GMRES(m), parallel computation.
In [7] we developed a set of explicit ODE integration methods; among them, the methods of order 1 and 2 possess a very large stability region in the direction of the real axis, which can be extended to the point at infinity. This fact might raise doubts; an explanation can be found in [8].

Based on this fact, namely that an explicit ODE method can have a very large stability region, we can use such a method as a tool to handle a variety of large scale systems: stiff equations, linear algebraic equations, nonlinear equations, and so on.

II. THE ALGORITHM DESCRIPTION
In [7] we gave a set of ODE numerical integration methods; among them, the 1st order method is very suitable for solving nonlinear equations. In [6] a derivation of this method was given. In this paper we just give a description of this 1st order method and describe how to apply it to solve nonlinear equations.

For nonlinear equations F(X) = 0, we consider the differential equation initial value problem:

Ẋ = F(X), X(0) = X_0

Our ODE integration method can be written as follows:

X^(0)_{n+1} = X_n + Z_n
Z_{n+1} = ω(hF(X^(0)_{n+1}) + Z_n)
X_{n+1} = X_n + Z_{n+1}

Here Z_0 = hF(X_0), ε > 0 is a parameter, h is the step size, and ω = h/(ε + h).

The basic idea of using the ODE method to solve a nonlinear equation is that the solution of the nonlinear equation F(X) = 0 can be interpreted as a steady or equilibrium point of the dynamical system Ẋ = F(X).

We know that the nonlinear equations +F(X) = 0 and −F(X) = 0 are equivalent, but the dynamical systems Ẋ = +F(X) and Ẋ = −F(X) are entirely different. Only when ∂(+F(X))/∂X < 0 or ∂(−F(X))/∂X < 0 can the dynamical system have an equilibrium point. That is to say, for the ODE method to be applicable to F(X) = 0, F(X) must satisfy the half-plane condition, and a proper sign "+" or "−" must be chosen. Usually, in the optimization area, ∂F(X)/∂X is recognized to be positive definite, so in [8] the algorithm was written as follows.

For u′ = −F(u), u(0) = u_0:

n = 0; Z_0 = hF(u_0); ω = h/(ε + h)
u^(0)_1 = u_0 − Z_0
Compute F(u^(0)_1) and ρ = ‖F(u^(0)_1)‖
while ρ > τ do
    Z_{n+1} = ω(hF(u^(0)_{n+1}) + Z_n)
    u_{n+1} = u_n − Z_{n+1}
    n ← n + 1
    u^(0)_{n+1} = u_n − Z_n
    Compute F(u^(0)_{n+1}) and ρ = ‖F(u^(0)_{n+1})‖
end while

In order to strengthen comprehension of our method, we give the following comparison. The existing pseudo-transient continuation methods can be written as

u_{n+1} = u_n − (δ_n^{−1} I + F′(u_n))^{−1} F(u_n)

It is clear that when δ_n → 0 the algorithm is close to the explicit Euler method, and when δ_n → ∞ it is close to the Newton method. Various algorithms adopt different strategies for controlling δ_n. All these algorithms require the solution of linear equations, so we refer to them as implicit methods. Now let us see what happens in our method.

Consider the node point n + 2. If ε = h, we have ω = 1/2 and

u^(0)_{n+2} = u_{n+1} − Z_{n+1} = u_n − Z_{n+1} − Z_{n+1}
            = u_n − 2ω(hF(u^(0)_{n+1}) + Z_n)
            = u_n − Z_n − hF(u^(0)_{n+1}) = u^(0)_{n+1} − hF(u^(0)_{n+1})

So we can conclude that at node point n + 2 the value of u^(0)_{n+2} is given exactly by the explicit Euler method. That is to say, for ε = h our method is equivalent to the explicit Euler method. This fact has been confirmed by many numerical results.

When ε → ∞ our algorithm is close to the implicit Euler method, which has a very large stability region. For the model equation ẋ = λx, the stability region includes the whole hλ-plane except a disk with center (1, 0) and radius 1. For this implicit method, the PEC scheme turns it into an explicit method, so our method was named EXPLICIT PSEUDO-TRANSIENT CONTINUATION by C. T. Kelley and Li-Zhi Liao [8]. By adjusting the parameter ε we can still obtain a very large stability region, so ε plays a very important role for stability. This is why our method differs from traditional ones: it is explicit, but has a very large stability region.

As we pointed out in Section I, the 2nd order method (trapezoidal rule), which was developed in [7], also has a very large stability region, but it is not L-stable ([9], p. 236). However, the 1st order implicit Euler method for the model equation ẋ = λx can be expressed as x_{n+1} = x_n/(1 + hλ). When hλ → ∞, 1/(1 + hλ) → 0, which means the method is L-stable. For a large step size, x_{n+1} tends toward zero quickly, so we can expect a fast convergence rate for our iterative process.

III. THE IMPLEMENTATION OF THE ALGORITHM AND THE STEP CONTROL STRATEGY
So far we have not developed an adaptive program which can automatically choose the parameter ε and the step size h. Since our method can be used in many different numerical calculation areas, each area should have its own adaptive program to satisfy its special needs.

For example, in the optimization area the differential equation

Ẋ = −∇f(X)

is said to have gradient structure. By the chain rule we have

d/dt f(X) = Σ_{i=1}^{N} (∂f/∂x_i)(dx_i/dt) = −Σ_{i=1}^{N} (∂f/∂x_i)^2 = −‖∇f(X)‖^2.
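To make the iteration of Section II concrete, here is a minimal Python sketch (not the authors' code); the test system F(x) = x³ − 1, the step size, and the tolerance are hypothetical choices for illustration:

```python
import numpy as np

def explicit_ptc(F, u0, h=0.5, eps=0.5, tol=1e-10, max_iter=10000):
    """Sketch of the explicit pseudo-transient continuation loop for u' = -F(u).

    Follows the pseudocode in the text: Z_0 = h*F(u_0), omega = h/(eps + h),
    predictor u_pred = u_n - Z_n, then Z_{n+1} = omega*(h*F(u_pred) + Z_n).
    No Jacobian is formed at any point.
    """
    u = np.asarray(u0, dtype=float)
    omega = h / (eps + h)
    Z = h * F(u)
    u_pred = u - Z
    for _ in range(max_iter):
        Fp = F(u_pred)
        if np.linalg.norm(Fp) <= tol:      # the "while rho > tau" test
            break
        Z = omega * (h * Fp + Z)           # corrector for Z
        u = u - Z                          # u_{n+1} = u_n - Z_{n+1}
        u_pred = u - Z                     # next predictor
    return u_pred

# Hypothetical test system with root x = 1 (Jacobian 3x^2 > 0 near the root).
root = explicit_ptc(lambda x: x**3 - 1.0, np.full(5, 0.5))
```

With eps = h the predictor sequence reduces to the explicit Euler method, exactly as derived above.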
It demonstrates that the analytic solution X(t) makes f(X(t)) decrease in Euclidean norm as t increases. So many existing adaptive programs check whether the inequality f(X_{n+1}) < f(X_n) holds or not. In our method we take the step size so large that it produces a very large local error, and the numerical solution X_n may go far from the analytic solution X(t_n); so in our program, checking this inequality is not necessary.

Our experience is that as long as the sequence of points {X_n} keeps within the attraction domain of the equilibrium point X*, the norm of F(X) is allowed to have some extent of oscillation. Sometimes the oscillation may be very violent, and the difference of the norm may reach as much as five orders of magnitude; after a while the oscillation disappears and the sequence {X_n} progresses towards X* while ‖F(X)‖ is decreasing. So we just need to take measures to avoid overflow and to restrict the amplitude of the oscillation within a limit; say, in the process we do not allow ‖F(X_{n+1})‖ > 10^4 ‖F(X_n)‖, and if this case occurs, we reduce the step size by a proper proportion.

For stiff ODEs not only the steady state but also the transient process is of concern. So the strategy of step size control is totally different from the strategy adopted by a nonlinear solver: it should control the truncation error when the step size is increased.

In the linear case, in [10] we have discussed how to choose the parameters ε and h, and obtained some results. They can be utilized as a reference for the nonlinear case, but to use these results we need knowledge of the spectrum of the Jacobian. This may raise some new troubles, unless one is engaged in some special area and has a deeper understanding of the system.

As a general and simple strategy, we propose the following way: separate the calculation into several stages. At the beginning, ‖F(X)‖ is relatively large, so we take a smaller step size h1; as the process goes on, when ‖F(X)‖ < Tol1, we enlarge the step size h, and we repeat the above procedure until ‖F(X)‖ < Tol_i, the stopping condition, is satisfied for a certain Tol_i.

Another important point is that we replace the dynamical equation Ẋ = F(X) by Ẋ = −D^{−1}(X)F(X), where D(X) is the diagonal matrix of the Jacobian. If D(X) is nonsingular, the systems Ẋ = −D^{−1}(X)F(X) and Ẋ = F(X) have the same equilibrium point.

Although the Jacobian does not appear in our algorithm, its properties still have an important effect on the calculation. By taking this measure, dividing by D(X), we gain the following benefits:
1) It concentrates the widely distributed eigenvalues of the Jacobian in a smaller region, which can speed up convergence greatly.
2) It helps us to choose the parameter ε. If the Jacobian of the system is diagonally dominant, choosing ε = 0.5 will be very reasonable.
3) It avoids the trouble of determining the sign "+" or "−" in front of F(X). No matter whether F(X) is negative or positive, D^{−1}(X)F(X) is always positive.

Implicit methods are usually adopted for PDE boundary and initial value problems. If the Newton method is used, how to solve the linear equations will be an important subject. No matter whether direct or iterative methods are used, for moderate systems preconditioning skill is required, and this increases the complexity greatly.

Large and very large systems evoke a rapid development of parallel algorithms. For dense matrices LU factorization, and for symmetric matrices Cholesky factorization, various parallel algorithms have been proposed. For parallel iterative methods, there are the Jacobi, Gauss-Seidel, and SOR iterations ([11], [12]), as well as the preconditioned conjugate gradient (PCG) method. Apart from the parallel versions of these traditional methods, parallel multi-splitting iterative methods have been developed. Among all these methods, the Jacobi iteration is the easiest to implement, but it is recognized as less efficient. We will see that our method is powerful and is as simple as the Jacobi iteration for nonlinear equations. So it should be the most suitable for parallel calculation among all nonlinear solvers.

In order to avoid the difficulty of solving linear equations, people turn their attention to explicit PDE algorithms. Since the 1980's, Evans and Abdullah have published a series of explicit algorithms for parabolic equations (e.g., [13], [14]) to meet the requirement of parallel calculation.

Ours is even more suitable for parallel calculation when compared with all those methods and other existing methods. Another thing is worth mentioning: for regular spatial domains, our method has the same simplicity for 1-, 2-, and 3-dimension problems; this unusual feature makes our method very efficient for handling high-spatial-dimension problems.

IV. THE RESULTS
A. PDE BOUNDARY VALUE PROBLEM
It is well known that the nonlinear PDE boundary value problem is an important source of large and super large scale nonlinear equations. For example, a three-space-dimension PDE boundary value problem, if each spatial direction has 100 mesh points, will produce 1,000,000-dimension nonlinear equations. It will be a hard task for any existing method to handle it.

In order to show the capability of our method, we construct an artificial problem as our example 1:

u_xx + u_yy + u_zz − (y²z² + x²z² + x²y²)u²/e^{xyz} = 0

Obviously, u(x, y, z) = e^{xyz} satisfies the above equation. We use 2nd order centered finite differences to form the discrete equations. On the boundary of the domain [0, 1]³ we take the values of e^{xyz} as the boundary values, with mesh width HB = 1.0/101.0. Then a boundary value problem was formed, which comprises 1,000,000-dimension nonlinear equations.

Because in our method the Jacobian is not needed, we can directly use the 3-dimension array U(102, 102, 102) to write our code. The values at x = 0, 1, y = 0, 1, z = 0, 1 are the boundary values.
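The staged step-size strategy and the diagonal scaling Ẋ = −D⁻¹(X)F(X) described above can be sketched as follows; the thresholds Tol_i, the step sizes h_i, and the small test system are hypothetical choices, not values from the paper's examples:

```python
import numpy as np

def staged_solve(F, diagJ, x0, tols=(1.0, 1e-2), steps=(0.1, 0.2, 0.3),
                 eps=0.5, final_tol=1e-6, max_iter=100000):
    """Stage-wise step control for the scaled system X' = -D^{-1}(X) F(X).

    Uses steps[0] while ||F|| >= tols[0], steps[1] while ||F|| >= tols[1],
    and steps[2] below that; stops when ||F|| < final_tol.
    diagJ(x) returns the diagonal of the Jacobian, i.e. the matrix D(X).
    """
    x = np.asarray(x0, dtype=float)

    def G(v):                                  # scaled right-hand side
        return F(v) / diagJ(v)

    Z = steps[0] * G(x)
    x_pred = x - Z
    for _ in range(max_iter):
        norm_F = np.linalg.norm(F(x_pred), ord=np.inf)
        if norm_F < final_tol:
            break
        stage = sum(norm_F < t for t in tols)  # 0, 1, or 2
        h = steps[stage]                       # enlarge h as ||F|| shrinks
        omega = h / (eps + h)
        Z = omega * (h * G(x_pred) + Z)
        x = x - Z
        x_pred = x - Z
    return x_pred

# Hypothetical diagonally dominant test system: F(x) = e^x - 1, D(x) = e^x.
x = staged_solve(lambda v: np.exp(v) - 1.0, lambda v: np.exp(v), np.full(3, 2.0))
```

A guard in the spirit of the text, reducing h whenever ‖F(X_{n+1})‖ exceeds 10⁴‖F(X_n)‖, could be added inside the loop; it is omitted here for brevity.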
The initial values:

u(0, I, J, K) = 1 + sin((I − 1)HB) + sin((J − 1)HB) + sin((K − 1)HB)

According to [15] p. 269, using the forward difference method (explicit Euler method) to integrate this equation, the step size h should satisfy h ≤ HB²/6α. This inequality means the spectral radius ρ of the coefficient matrix is 2/(HB²/6α) = 12α/HB². Based on the analysis of [10], our ODE method should choose ε < (4/3)(1/2)(HB²/6α); then for any h, 0 < h < ∞, our method is stable.

For α = 10, the results of example 2 are listed in Table 2, where we use the error of u(t, 52, 52, 52) as a representative to give a general idea of the accuracy of our calculation. NFE has the same meaning as in example 1. The interval of integration is [0, 1]; at the end point the error is 0.3128 × 10⁻⁴.

For the explicit Euler method, if we take h = HB²/6α, then ρh will be at the margin of the stability region. In order to obtain better stability we take h = 0.9 × HB²/6α. The results are listed in Table 3.

Obviously, U(x, y) = sin x sin y satisfies the equation. We discretize the differential equation by the normal 5-point formula with the uniform mesh width HB = π/501; then for the interior mesh points ((I − 1)HB, (J − 1)HB) (I, J = 2, 3, . . . , 501) the function values U(I, J) = U((I − 1)HB, (J − 1)HB) comprise a system of linear equations L(U(I, J)) = 0 with dimension = 250,000. The boundary point values are merged into the right hand side; in our case they are all equal to zero. The remaining arguments are similar to those in example 1. We omit them and just list the results of both algorithms in Tables 4 and 5. The iteration initial values are U_{I,J} = U(I, J) = 0.5, [I, J = 2, 3, . . . , 501].

TABLE 4. The results of the ODE method for example 3, ε = 0.5; the initial Inf. Norm of L(U(I, J)) is 0.100031 × 10¹.
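The PDE examples rely on evaluating the residual F(U) directly from the stencil, with no Jacobian ever stored. A minimal matrix-free 5-point residual in the spirit of these examples (the right-hand side f and the test function are hypothetical) looks like this:

```python
import numpy as np

def residual_5pt(U, f, hb):
    """Matrix-free residual of the 5-point Laplacian: Lap(U) - f at interior points.

    U is an (m+2) x (m+2) grid that includes boundary values; each interior
    component is computed independently, which is what makes this kind of
    evaluation easy to vectorize and parallelize.
    """
    R = np.zeros_like(U)
    R[1:-1, 1:-1] = (U[2:, 1:-1] + U[:-2, 1:-1] + U[1:-1, 2:] + U[1:-1, :-2]
                     - 4.0 * U[1:-1, 1:-1]) / hb**2 - f[1:-1, 1:-1]
    return R
```

Since the 5-point formula is exact for quadratics, U(x, y) = x² + y² with f = 4 gives a residual at rounding-error level, which is a convenient correctness check.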
Then we combine any iteration with the inexact Newton method as follows:
1) Give the Newton iteration initial value X_0 and calculate F(X_0) = −AX_0 + b. We know the Jacobian = −A, so we get the linear equation AΔX = F(X_0).
2) For the linear equation AΔX = F(X_0), put the iteration initial value ΔX_0 = 0.0.
3) Write the initial residual norm ‖R(ΔX_0)‖ = ‖AΔX_0 − F(X_0)‖ = R_0.
4) Use any iteration method to solve for ΔX; the iteration stopping condition is R_n/R_0 < 0.1, where R_n = ‖R(ΔX_n)‖ = ‖AΔX_n − F(X_0)‖. We get ΔX_n.
5) Return to the main program and calculate the revised value of X: X_1 = X_0 + ΔX_n.
6) Check the inequality ‖F(X_1)‖ < Tol; if it holds, then end.

The total NL = 9082.
The time consumed = 268.
Max Error = 0.822352 × 10⁻⁵.

From the data above, we can see that the linear iteration number is 9082; it seems too large compared with 1906, which is in the last row of Table 4, the result of the non-accelerating method. But the time proportion is 268/557. That is to say, although our new approach needs more iterations, the work amount of each iteration is much less than that of the non-accelerating method, and it can get a very accurate solution.

For GMRES(50), we still use the simplified 5-line band Jacobian in the inner GMRES iteration. The results are listed in Table 7.

TABLE 7. The results using GMRES(50) as inner iteration.

Compared with 2) GMRES(50), the time consumption proportion is 63/1504 ≈ 1/25. Furthermore, the accuracy is improved: 0.403872 × 10⁻² vs. 0.105555 × 10⁻¹.

D. GENERALIZED BROWN ALMOST LINEAR PROBLEMS
In [6] we solved a tough equation, the Brown almost linear problem, for N = 100. Can we enlarge the dimension N from 100 to 1,000,000? For the first (N − 1) linear equations there is no problem; the difficulty occurs in the last nonlinear equation ∏_{i=1}^{N} x_i − 1 = 0. We generalize this problem in the following way:

For N = 1,000,000,

f_i(X) = x_i + Σ_{j=1}^{N} x_j − (N + 1) = 0,  i = 1, 2, . . . , N − 1,

f_N(X) = ∏_{i=1}^{100} x_{i×10000} − 1 = 0.

The initial values are x_i(0) = 0.5. The solution to be sought is x* = (1, 1, . . . , 1)ᵀ.

Set ε = 0.2 × 10⁻⁵, Tol1 = 1.0, Tol2 = 10⁻³, Tol3 = 10⁻⁵, h1 = 0.0001, h2 = 0.05, h3 = 0.1. After 36 iterations, we have the following results:

x_i = 1.0000000000, i = 1, 2, . . . , N − 1,
x_N = 0.9999999803,
Inf. Norm of f(X) = 0.6060 × 10⁻⁵.

E. NONSYMMETRIC LINEAR PROBLEM
Consider the following nonsymmetric linear problem of dimension N = 1,000,000, the linear equation Ax = B, where

A =
[ 2    0.5   1    1   · · ·  1   ]
[ 1    2    0.5   1   · · ·  1   ]
[ . . .           . . .          ]
[ 1    1    · · ·      2    0.5  ]
[ 1    1    · · ·      1    2    ]

B = (b_1, b_2, . . . , b_N)ᵀ,
b_i = 1,000,000.5, i = 1, 2, . . . , N − 1,
b_N = 1,000,001.0.

Set ε = 0.2 × 10⁻⁵. For the initial values x_i(0) = 0.5 and 5.0: Tol1 = 1.0, Tol2 = 10⁻³, Tol3 = 10⁻⁶, h1 = 0.1, h2 = 0.2, h3 = 0.3.

We have the results:
1) For x_i(0) = 0.5, after 9 iterations the Inf. Norm of f = 0.1693 × 10⁻⁶:

x_i = 1.0000000000, i = 1, 2, . . . , 999997,
x_999998 = 0.9999999938,
x_999999 = 1.0000000563,
x_1000000 = 0.9999998875.

2) For x_i(0) = 5.0, after 15 iterations the Inf. Norm of f = 0.2731 × 10⁻⁶:

x_i = 1.0000000000, i = 1, 2, . . . , 999996,
x_999997 = 0.9999999776,
x_999998 = 1.0000002076,
x_999999 = 0.9999995501,
x_1000000 = 1.0000000901.

For the initial value x_i(0) = 50.0, we cannot make ‖f‖ < 10⁻⁶ with these settings. We set Tol1 = 1.0, Tol2 = 10⁻², Tol3 = 10⁻⁴, h1 = 0.1, h2 = 0.2, h3 = 0.3. After 11 iterations, the Inf. Norm of f = 0.8103 × 10⁻⁴:

x_i = 0.9999999999, i = 1, 2, . . . , 999996,
x_999997 = 0.9999999173, x_999998 = 1.0000012498,
x_999999 = 0.9999943524, x_1000000 = 1.0000075451.

V. CONCLUSION
In this paper we have shown the efficiency of our method by 5 examples. Especially for example 5, a nonsymmetric system of linear equations with dimension N = 1,000,000, we got the exact solution with only 9 function evaluations. As for example 4, it is large (dimension = 1,000,000) and is also a tough one (with a nonlinear equation of 100th degree); we handled it without any trouble. Examples 1, 2, and 3 are related to partial differential equations. This area is a main source of large scale system problems. Many mathematical models of practical science and engineering problems are described by PDEs, so solving these equations efficiently is of very important significance. A general point of view is that implicit methods are more efficient than explicit methods, but when the system is very large, parallel computation should be adopted. The program of a parallel algorithm for an implicit method is hard to code, so people turn their attention to developing new explicit methods to satisfy the needs of parallel computation. Our method is the simplest one among those explicit methods: all the variables in the formulas appear in vector form, and each component of F(x) can be calculated independently.

In this paper we did not involve hyperbolic equations. We hope we can continue our research on them in the near future.

REFERENCES
[1] J. D. Hoffman, Numerical Methods for Engineers and Scientists, 2nd ed. New York, NY, USA: McGraw-Hill, 1992.
[2] A. A. Samarskii, Theory of Difference Schemes. New York, NY, USA: Marcel Dekker, 2001.
[3] D. V. Hutton, Fundamentals of Finite Element Analysis. New York, NY, USA: McGraw-Hill, 2005.
[4] R. Eymard, T. R. Gallouët, and R. Herbin, "The finite volume method," in Handbook of Numerical Analysis, vol. 7. Amsterdam, The Netherlands: North Holland, 2000, pp. 713–1020.
[5] R. J. Leveque, Finite Volume Methods for Hyperbolic Problems. Cambridge, U.K.: Cambridge Univ. Press, 2002.
[6] T. Han and Y. Han, "Solving large scale nonlinear equations by a new ODE numerical integration method," Appl. Math., vol. 1, no. 3, pp. 222–229, Sep. 2010.
[7] T. Han and Y. Han, "Solving implicit equations arising from Adams-Moulton methods," BIT Numer. Math., vol. 42, no. 2, pp. 336–350, 2002.
[8] C. T. Kelley and L.-Z. Liao, "Explicit pseudo-transient continuation," Pacific J. Optim., vol. 9, pp. 77–91, Jan. 2013.
[9] J. D. Lambert, Computational Methods in Ordinary Differential Equations. New York, NY, USA: Wiley, 1973.
[10] T. Han, X. Luo, and Y. Han, "Solving large scale unconstrained minimization problems by a new ODE numerical integration method," Appl. Math., vol. 2, no. 5, pp. 527–532, May 2011.
[11] L. Adams and D. Xie, "New parallel SOR method by domain partitioning," SIAM J. Sci. Comput., vol. 20, no. 6, pp. 2261–2281, 1999.
[12] W. Niethammer, "The SOR method on parallel computers," Numer. Math., vol. 56, nos. 2–3, pp. 247–254, 1989.
[13] D. J. Evans and A. R. B. Abdullah, "Group explicit methods for parabolic equations," Int. J. Comput. Math., vol. 14, no. 1, pp. 73–105, 1983.
[14] D. J. Evans and A. R. B. Abdullah, "A new explicit method for the diffusion-convection equation," Comput. Math. Appl., vol. 11, nos. 1–3, pp. 145–154, Jan./Mar. 1985.
[15] R. S. Varga, Matrix Iterative Analysis. Englewood Cliffs, NJ, USA: Prentice-Hall, 1962.
[16] Y. Saad, Iterative Methods for Sparse Linear Systems. Boston, MA, USA: PWS Publishing Company, 1996.
[17] R. S. Dembo, S. C. Eisenstat, and T. Steihaug, "Inexact Newton methods," SIAM J. Numer. Anal., vol. 19, no. 2, pp. 400–408, 1982.

TIANMIN HAN received the degree in mathematics from Lanzhou University, Gansu, China, in 1961. He was the Chair of the Student Association of the same department.
He was with the China Academy of Sciences, Beijing, China. His research fields are stiff differential equations and ill-conditioned algebraic equations. The representative publication is entitled "A Numerical Method for Solving Initial Value Problems of Ordinary Differential Equations," SCIENTIA SINICA, vol. XIX, no. 2, pp. 180–198, 1976 (SCIENTIA SINICA is the top scientific journal in China). The abstract of this paper and very positive review comments by Prof. J. C. Butcher (Auckland) can be found in Mathematics Review, USA, vol. 56, no. 4, 10016, p. 1356, Oct. 1978. While working for the China Academy of Sciences, he organized nationwide education programs for computational methods and popularized computer applications in various fields. He was one of the two chief organizers of the National Computation Mathematics Academy Conference in 1974 (well known as the 748 Conference in China). This conference was of very important significance for the development of Chinese computational mathematics. In 1985, he started working for the China Electric Power Research Institute, Beijing, teaching M.S. and Ph.D. students, and continued his research. His main interests are numerical methods for various large scale systems, such as linear and nonlinear equations, ordinary differential equations and partial differential equations initial and boundary value problems, as well as optimization.

YUHUAN HAN received the B.S. degree in computer science from Beijing Polytechnic University, Beijing, China, in 1993, and the M.S. degree in operations research and management science from George Mason University, Fairfax, VA, USA, in 1999.
He has been working for Fortune 500 companies for over 13 years in senior positions. He is the co-founder of Hedge Fund of America, LP, DE, USA, which is a registered company and a Morningstar database listed company. He is the President of a private foundation registered in California, USA. He founded an online shopping search engine as Chief Technical Officer. He was with the China Academy of Sciences, Beijing, China.