
Received May 10, 2013, accepted August 18, 2013, date of publication August 30, 2013, date of current version September 10, 2013.


Digital Object Identifier 10.1109/ACCESS.2013.2280244

Numerical Solution for Super Large Scale Systems


TIANMIN HAN1 AND YUHUAN HAN2
1 China Electric Power Research Institute, Beijing 100192, China
2 Hedge Fund of America, San Jose, CA 95133, USA
Corresponding authors: T. Han (hantianmin@126.com) and Y. Han (yuhuan.han@gmail.com)

ABSTRACT In this paper, a new ordinary differential equation (ODE) numerical integration method is successfully applied to various mathematical branches, such as partial differential equation (PDE) boundary value problems, PDE initial-boundary value problems, tough nonlinear equations, and so forth. The new method does not use the Jacobian, so it can handle very large systems, say of dimension N = 1,000,000 or even larger. In addition, we give a very simple accelerating convergence approach for the linear algebraic equations arising from linear PDE boundary value problems. All the numerical results show that the new method is very promising for super large scale systems.

INDEX TERMS Super large scale systems, numerical solution, ODE method, PDE boundary problem,
nonlinear equations, linear equations, inexact Newton method, GMRES(m), parallel computation.

I. INTRODUCTION
In the literature of numerical calculation, a system of dimension N = 10,000 is usually called a large scale system. In this paper we handle systems of dimension N = 1,000,000, so we refer to them as "Super Large Scale Systems". These problems arise in many mathematical branches, practical sciences, and engineering areas. Among these sources, the numerical solution of partial differential equations (PDEs) is a typical representative.

The discrete approximation of PDEs mainly takes three approaches:

1) Finite difference method [1], [2]. This is the most traditional method; nevertheless, it is still widely used in various fields today, and many parallel algorithms have been developed for it.
2) Finite element method [3]. Compared with the finite difference method, it is more general and powerful in its application to problems that involve complicated physical geometry and boundary conditions.
3) Finite volume method. It is widely used in many computational fluid dynamics problems [4], [5].

In this paper it is impossible to cover such a wide range of subjects. We only study nonlinear equations formed by the finite difference method, and use them to show the capabilities of our algorithm.

When we confront a large system, we probably face an ill-conditioned system at the same time. For example, consider the numerical solution of the Dirichlet problem

  −uxx − uyy = 0,  0 < x, y < 1

for the unit square R with boundary B, subject to the boundary condition u(x, y) = f(x, y) on B. Using the uniform mesh x_i = ih, y_j = jh, h = 1/N, we get the finite difference approximation

  u_ij − (1/4)(u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1}) = h² r_{i,j}.

This is a linear equation for 1 ≤ i, j ≤ N − 1; it can be written as

  AU = K

where A is an (N − 1)² × (N − 1)² matrix. For N = 30, 100, 1000, the condition numbers of A are 364, 3000, 4 × 10⁵, respectively.

For super large scale nonlinear equations, the Jacobian is a 10⁶ × 10⁶ matrix, and all methods that rely on the Jacobian meet severe difficulty. The fixed point iteration method does not involve the Jacobian, but, as we pointed out in [6], it is equivalent to the explicit Euler ODE method, and for a large scale stiff system it still cannot work.

All the explicit ODE methods, including the P(EC)m schemes of implicit methods, have only a very small stability region. This feature strictly limits the step size to a very narrow interval, so none of them is suitable as a tool for solving nonlinear equations.
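The quoted condition numbers can be checked directly. The following is our own small sketch, not code from the paper: the helper name `laplace_matrix` and the Kronecker-product construction of the scaled 5-point matrix are illustrative choices, and the check is run at N = 30, where the condition number should come out close to the quoted 364.

```python
import numpy as np

def laplace_matrix(N):
    """Matrix A of the scaled 5-point scheme
    u_ij - (1/4)(u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1}),
    acting on the (N-1)^2 interior unknowns of an N x N uniform mesh."""
    m = N - 1
    S = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)  # neighbour shifts
    I = np.eye(m)
    return np.eye(m * m) - 0.25 * (np.kron(I, S) + np.kron(S, I))

A = laplace_matrix(30)
print(np.linalg.cond(A))   # close to the 364 quoted for N = 30
```

The eigenvalues of this matrix are 1 − (cos(iπ/N) + cos(jπ/N))/2, so the condition number grows like 4N²/π², which explains why the system becomes ill-conditioned as the mesh is refined.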

VOLUME 1, 2013 2169-3536 2013 IEEE 537


T. Han, Y. Han: Numerical Solution for Super Large Scale Systems

In [7] we developed a set of explicit ODE integration methods. Among them, the methods of order 1 and 2 possess a very large stability region which, in the direction of the real axis, can be extended to the point at infinity. This fact might raise doubts; an explanation can be found in [8].

Based precisely on this fact, that an explicit ODE method can have a very large stability region, we can use such a method as a tool to handle a variety of large scale systems: stiff equations, linear algebraic equations, nonlinear equations, and so on.

II. THE ALGORITHM DESCRIPTION
In [7] we gave a set of ODE numerical integration methods; among them, the 1st order method is very suitable for solving nonlinear equations. A derivation of this method is given in [6]. In this paper we only describe the 1st order method and how to apply it to solve nonlinear equations.

For the nonlinear equations F(X) = 0, we consider the initial value problem

  Ẋ = F(X),  X(0) = X_0.

Our ODE integration method can be written as follows:

  X^(0)_{n+1} = X_n + Z_n
  Z_{n+1} = ω(εF(X^(0)_{n+1}) + Z_n)
  X_{n+1} = X_n + Z_{n+1}

Here Z_0 = hF(X_0), ε > 0 is a parameter, h is the step size, and ω = h/(ε + h).

The basic idea of using the ODE method to solve a nonlinear equation is that the solution of F(X) = 0 can be interpreted as a steady or equilibrium point of the dynamical system Ẋ = F(X).

We know that the nonlinear equations +F(X) = 0 and −F(X) = 0 are equivalent, but the dynamical systems Ẋ = +F(X) and Ẋ = −F(X) are entirely different. Only when ∂(+F(X))/∂X < 0 or ∂(−F(X))/∂X < 0 can the dynamical system have an equilibrium point. That is to say, for the ODE method to be applicable to F(X) = 0, F(X) must satisfy a half-plane condition and a proper sign "+" or "−" must be chosen. Usually, in the optimization area, ∂F(X)/∂X is recognized to be positive definite, so in [8] the algorithm was written as:

  For u̇ = −F(u), u(0) = u_0:
  n = 0; Z_0 = hF(u_0); ω = h/(ε + h)
  u^(0)_1 = u_0 − Z_0
  Compute F(u^(0)_1) and ρ = ||F(u^(0)_1)||
  While ρ > τ do
    Z_{n+1} = ω(εF(u^(0)_{n+1}) + Z_n)
    u_{n+1} = u_n − Z_{n+1}
    n ← n + 1
    u^(0)_{n+1} = u_n − Z_n
    Compute F(u^(0)_{n+1}) and ρ = ||F(u^(0)_{n+1})||
  end while

In order to strengthen comprehension of our method, we give the following comparison. The existing pseudo-transient continuation methods can be written as

  u_{n+1} = u_n − (δ_n^{−1} I + F′(u_n))^{−1} F(u_n).

It is clear that when δ_n → 0 this algorithm is close to the explicit Euler method, and when δ_n → ∞ it is close to the Newton method. Various algorithms adopt different strategies for controlling δ_n. All of them require the solution of linear equations, so we refer to them as implicit methods.

Now let us see what happens in our method. Consider the node point n + 2. If ε = h, we have ω = 1/2 and

  u^(0)_{n+2} = u_{n+1} − Z_{n+1} = u_n − Z_{n+1} − Z_{n+1}
             = u_n − 2ω(hF(u^(0)_{n+1}) + Z_n)
             = u_n − Z_n − hF(u^(0)_{n+1}) = u^(0)_{n+1} − hF(u^(0)_{n+1}).

So we conclude that at node point n + 2 the value of u^(0)_{n+2} is given exactly by the explicit Euler method; that is to say, for ε = h our method is equivalent to the explicit Euler method. This fact has been confirmed by many numerical results.

When ε → ∞ our algorithm is close to the implicit Euler method, which has a very large stability region: for the model equation ẋ = λx, the stability region includes the whole hλ-plane except a disk with center (1, 0) and radius 1. For this implicit method, the PEC scheme turns it into an explicit method. For this reason our method was named EXPLICIT PSEUDO-TRANSIENT CONTINUATION by C. T. Kelley and Li-Zhi Liao [8]. By adjusting the parameter ε we can still obtain a very large stability region, so ε plays a very important role for stability. This is why our method differs from traditional ones: it is explicit, but has a very large stability region.

As we pointed out in Section I, the 2nd order method (trapezoidal rule) developed in [7] also has a very large stability region, but it is not L-stable ([9], p. 236). The 1st order implicit Euler method for the model equation ẋ = λx, however, can be expressed as x_{n+1} = x_n/(1 + hλ); when hλ → ∞, 1/(1 + hλ) → 0, which means it is L-stable. For a large step size, x_{n+1} tends toward zero quickly, so we can expect a fast convergence rate for our iterative process.

III. THE IMPLEMENTATION OF THE ALGORITHM AND THE STEP CONTROL STRATEGY
So far we have not developed an adaptive program which can automatically choose the parameter ε and the step size h. Since our method can be used in many different numerical calculation areas, each area should have its own adaptive program to satisfy its special needs.

For example, in the optimization area the differential equation

  Ẋ = −∇f(X)

is said to have gradient structure. By the chain rule we have

  d/dt f(X) = Σ_{i=1}^{N} (∂f/∂x_i)(dx_i/dt) = −Σ_{i=1}^{N} (∂f/∂x_i)² = −||∇f(X)||².
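The iteration of Section II can be sketched compactly in code. This is a minimal sketch, not the authors' program: the function name `epc_solve`, the test problem F(u) = u³ − 8, and the values h = 0.05, ε = 0.5 are our own illustrative assumptions, and the placement of ε inside the Z update follows our reading of the ε = h and ε → ∞ limits discussed above.

```python
import numpy as np

def epc_solve(F, u0, h, eps, tol=1e-10, max_iter=1000):
    """1st order explicit pseudo-transient continuation for u' = -F(u).

    Follows the Section II loop: Z0 = h*F(u0), omega = h/(eps + h),
    predictor u^(0)_{n+1} = u_n - Z_n, update Z_{n+1} = omega*(eps*F(u^(0)_{n+1}) + Z_n).
    """
    u = np.asarray(u0, dtype=float)
    z = h * F(u)
    omega = h / (eps + h)
    u_pred = u - z                       # u^(0)_1
    for _ in range(max_iter):
        Fu = F(u_pred)
        if np.max(np.abs(Fu)) < tol:     # Inf. norm stopping test
            return u_pred
        z = omega * (eps * Fu + z)       # Z_{n+1}
        u = u - z                        # u_{n+1} = u_n - Z_{n+1}
        u_pred = u - z                   # u^(0)_{n+2}
    return u_pred

# Demonstration on F(u) = u^3 - 8, whose root is u = 2; note no Jacobian is used.
root = epc_solve(lambda u: u**3 - 8.0, np.array([1.0]), h=0.05, eps=0.5)
print(root)   # close to 2
```

Note that each pass of the loop costs exactly one function evaluation, which is why NFE (number of function evaluations) is the natural cost measure in the experiments below.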


This identity demonstrates that the analytic solution X(t) makes f(X(t)) decrease as t increases. Many existing adaptive programs therefore check whether the inequality f(X_{n+1}) < f(X_n) holds or not. In our method we take the step size so large that it produces a very large local error, and the numerical solution X_n may go far from the analytic solution X(t_n); so in our program checking this inequality is not necessary.

Our experience is that as long as the sequence of points {X_n} stays within the attraction domain of the equilibrium point X*, the norm of F(X) is allowed to oscillate to some extent. Sometimes the oscillation may be very violent, and the norm may vary by as much as five orders of magnitude; after a while the oscillation disappears and the sequence {X_n} progresses towards X* while ||F(X)|| decreases. So we only need to take measures to avoid overflow and to restrict the amplitude of the oscillation within a limit: in the process we do not allow ||F(X_{n+1})|| > 10⁴||F(X_n)||, and if this occurs we reduce the step size by a proper proportion.

For stiff ODEs, not only the steady state but also the transient process is of concern, so the step size control strategy is totally different from the strategy adopted by a nonlinear solver: it must control the truncation error when the step size is increased.

For the linear case we discussed in [10] how to choose the parameters ε and h, and obtained some results. These can be used as a reference for the nonlinear case, but to use them one needs knowledge of the spectrum of the Jacobian. This may raise new troubles, unless one is engaged in a special area and has a deeper understanding of the system at hand.

As a general and simple strategy, we propose the following way: separate the calculation into several stages. At the beginning, ||F(X)|| is relatively large, so we take a smaller step size h_1; as the iteration proceeds and ||F(X)|| < Tol1, we enlarge the step size h, and we repeat this procedure until ||F(X)|| < Tol_i, the stopping condition, is satisfied for a certain Tol_i.

Another important point is that we replace the dynamical equation Ẋ = F(X) by Ẋ = −D^{−1}(X)F(X), where D(X) is the diagonal matrix of the Jacobian. If D(X) is nonsingular, the systems Ẋ = −D^{−1}(X)F(X) and Ẋ = F(X) have the same equilibrium point.

Although the Jacobian does not appear in our algorithm, its properties still have an important effect on the calculation. Dividing by D(X) delivers the following benefits:
1) It concentrates the widely distributed eigenvalues of the Jacobian in a smaller region, which can speed up convergence greatly.
2) It helps us choose the parameter ε: if the Jacobian of the system is diagonally dominant, choosing ε = 0.5 is very reasonable.
3) It avoids the trouble of determining the sign "+" or "−" in front of F(X): whatever the sign of F(X), the scaled system Ẋ = −D^{−1}(X)F(X) always has the proper sign for convergence.

Implicit methods are usually adopted for PDE boundary and initial value problems. If the Newton method is used, how to solve the linear equations becomes an important subject. Whether direct or iterative methods are chosen, preconditioning is required even for moderate systems, and this greatly increases the complexity.

Large and very large systems have evoked a rapid development of parallel algorithms. Various parallel algorithms have been proposed for dense matrix LU factorization and for symmetric matrix Cholesky factorization. Among parallel iterative methods there are the Jacobi, Gauss-Seidel, and SOR iterations ([11], [12]) as well as the preconditioned conjugate gradient (PCG) method. Apart from parallel versions of these traditional methods, parallel multi-splitting iterative methods have been developed. Among all of them, the Jacobi iteration is the easiest to implement but is recognized as less efficient. We will see that our method is powerful and is as simple as the Jacobi iteration for nonlinear equations, so it should be the most suitable for parallel calculation among all nonlinear solvers.

In order to avoid the difficulty of solving linear equations, attention has also turned to explicit PDE algorithms. Since the 1980s, Evans and Abdullah have published a series of explicit algorithms for parabolic equations (e.g., [13], [14]) to meet the requirements of parallel calculation. Ours is even more suitable for parallel calculation when compared with those and other existing methods.

Another point is worth mentioning: for regular spatial domains, our method has the same simplicity for 1-, 2-, and 3-dimensional problems. This unusual feature makes our method very efficient for handling high-spatial-dimension problems.

IV. THE RESULTS
A. PDE BOUNDARY VALUE PROBLEM
It is well known that nonlinear PDE boundary value problems are an important source of large and super large scale nonlinear equations. For example, a three-space-dimension PDE boundary value problem with 100 mesh points in each spatial direction produces 1,000,000-dimensional nonlinear equations; handling it is a hard task for any existing method.

In order to show the capability of our method, we construct an artificial problem as our example 1:

  uxx + uyy + uzz − (y²z² + x²z² + x²y²)u²/e^{xyz} = 0

Obviously, u(x, y, z) = e^{xyz} satisfies the above equation. We use 2nd order centered finite differences to form the discrete equation. On the boundary of the domain Ω = [0, 1]³ we take the values of e^{xyz} as the boundary values, with mesh width HB = 1.0/101.0. A boundary value problem is thus formed which comprises 1,000,000-dimensional nonlinear equations.

Because in our method the Jacobian is not needed, we can directly use a 3-dimensional array U(102, 102, 102) to write our code. The values at x = 0, 1, y = 0, 1, z = 0, 1 are


given by

  U(1, J, K) = 1.0
  U(102, J, K) = Exp(1.0 ∗ (J − 1) ∗ HB ∗ (K − 1) ∗ HB)
  U(I, 1, K) = 1.0
  U(I, 102, K) = Exp((I − 1) ∗ HB ∗ 1.0 ∗ (K − 1) ∗ HB)
  U(I, J, 1) = 1.0
  U(I, J, 102) = Exp((I − 1) ∗ HB ∗ (J − 1) ∗ HB ∗ 1.0)

We use R(I, J, K) to express the discrete F(X) which appears in our algorithm; then we get the discrete equation

  0 = R(I, J, K) = U(I − 1, J, K) + U(I + 1, J, K)
      + U(I, J − 1, K) + U(I, J + 1, K) + U(I, J, K − 1)
      + U(I, J, K + 1) − 6 ∗ U(I, J, K) − HB²
      ∗ (((J − 1) ∗ HB ∗ (K − 1) ∗ HB)²
      + ((I − 1) ∗ HB ∗ (K − 1) ∗ HB)²
      + ((I − 1) ∗ HB ∗ (J − 1) ∗ HB)²) ∗ U(I, J, K)² /
      Exp((I − 1) ∗ HB ∗ (J − 1) ∗ HB ∗ (K − 1) ∗ HB),
  I, J, K = 2, 3, . . . , 101.

The diagonal entries of the Jacobian are composed of two terms: one is −6, the other contains the squares of HB. The latter is very small compared with the former, so we take −6 as the diagonal entries to form the diagonal matrix D, which approximates the diagonal matrix of the Jacobian.

The iteration initial values are U(I, J, K) = 0.5, I, J, K = 2, 3, . . . , 101. If we take a norm of R as our stopping criterion, only the Inf. Norm = max_{1≤i≤N} |R_i| is reasonable: since our system is so large, the Euclidean norm can still be very large even though every |R_i| may be very small.

Single precision is good enough for our calculation; by contrast, for any direct method double precision is necessary even for a system of moderate dimension.

After the iteration finished, we calculated the difference between the analytic solution and our iterative solution; we refer to the maximum of its absolute values as the measure of approximation and denote it by Max Error.

Considering the existence of the discretization error, we take ||R|| < 10⁻⁵ as our stopping condition. At the beginning, ||R|| = 0.6496 × 10¹. We separate the iterative process into six stages, each stage using h_i and Tol_i as its step size and terminating condition; the intermediate results can be found in Table 1, where NFE denotes the Number of Function Evaluations.

TABLE 1. The results of example 1. ε = 0.5.

The only comparable method is the classical fixed point iteration. As we pointed out in [6], it is equivalent to the explicit Euler method with step size h = 1.0. We tested this method; the results show that to reach ||R|| < 10⁻⁵ it needs 12,240 function evaluations (NFE = 12,240), with Max Error = 0.3068 × 10⁻². We attempted to enlarge the step size to h = 1.1, but overflow happens. From the data above, NFE = 620 versus 12,240, the expense of our method is only about 1/19 of that of the classical one.

If we want a higher accuracy solution, using double precision and a smaller tolerance, say Tol = 10⁻⁹, then after 1753 function evaluations (NFE = 1753) we have Max Error = 0.2486 × 10⁻⁶.

B. PDE INITIAL-BOUNDARY VALUE PROBLEM
For u(t, x, y, z), we consider the following equation:

  ∂u/∂t = α(uxx + uyy + uzz)

Obviously u = 1 + e^{−αt}(sin x + sin y + sin z) satisfies the above equation. In the domain [0, π]³ we construct an initial-boundary value problem:

  at t = 0: u(0, x, y, z) = 1 + sin x + sin y + sin z
  at x = 0, π: u(t, 0, y, z) = u(t, π, y, z) = 1 + e^{−αt}(sin y + sin z)
  at y = 0, π: u(t, x, 0, z) = u(t, x, π, z) = 1 + e^{−αt}(sin x + sin z)
  at z = 0, π: u(t, x, y, 0) = u(t, x, y, π) = 1 + e^{−αt}(sin x + sin y)

First we discretize only the spatial variables, leaving the time variable continuous; thus we obtain an ODE initial value problem.

If the 7-point approximation with 100 interior points in each space direction is used to discretize (uxx + uyy + uzz), with mesh width HB = π/101.0, we get as our example 2 the following linear constant coefficient ODE initial value problem. For I, J, K = 2, 3, . . . , 101:

  d/dt u(t, I, J, K) = α(u(t, I − 1, J, K) + u(t, I + 1, J, K)
      + u(t, I, J − 1, K) + u(t, I, J + 1, K)
      + u(t, I, J, K − 1) + u(t, I, J, K + 1)
      − 6u(t, I, J, K))/HB²

Among them, the boundary values are known functions:

  u(t, 1, J, K) = u(t, 102, J, K) = 1 + e^{−αt}(sin((J − 1)HB) + sin((K − 1)HB))
  u(t, I, 1, K) = u(t, I, 102, K) = 1 + e^{−αt}(sin((I − 1)HB) + sin((K − 1)HB))
  u(t, I, J, 1) = u(t, I, J, 102) = 1 + e^{−αt}(sin((I − 1)HB) + sin((J − 1)HB))
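As a quick check of this semi-discretization, the right-hand side of the ODE system can be evaluated with array slicing and compared with the exact u_t at t = 0. The sketch below is our own downscaled illustration: the 22-point mesh (HB = π/21) replaces the paper's 102-point mesh, and α = 10 matches the experiment reported later.

```python
import numpy as np

alpha = 10.0
n = 22                                  # downscaled: 22 points per direction
HB = np.pi / (n - 1)
x = np.arange(n) * HB
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
U0 = 1 + np.sin(X) + np.sin(Y) + np.sin(Z)        # u(0, x, y, z)

def rhs(U):
    """alpha * (sum of the six neighbours - 6*U) / HB^2 on the interior points."""
    nb = (U[:-2, 1:-1, 1:-1] + U[2:, 1:-1, 1:-1]
          + U[1:-1, :-2, 1:-1] + U[1:-1, 2:, 1:-1]
          + U[1:-1, 1:-1, :-2] + U[1:-1, 1:-1, 2:])
    return alpha * (nb - 6.0 * U[1:-1, 1:-1, 1:-1]) / HB**2

# At t = 0 the exact solution gives u_t = -alpha*(sin x + sin y + sin z);
# the discrete right-hand side must agree with it up to O(HB^2).
u_t_exact = -alpha * (np.sin(X) + np.sin(Y) + np.sin(Z))[1:-1, 1:-1, 1:-1]
err = np.max(np.abs(rhs(U0) - u_t_exact))
print(err)    # O(HB^2) agreement
```

On the paper's finer mesh the same agreement holds with a much smaller O(HB²) constant; the point of the sketch is only that the right-hand side is a pure stencil evaluation, with no Jacobian ever formed.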


The initial values are

  u(0, I, J, K) = 1 + sin((I − 1)HB) + sin((J − 1)HB) + sin((K − 1)HB)

According to [15], p. 269, when the forward difference method (explicit Euler method) is used to integrate this equation, the step size h should satisfy h ≤ HB²/(6α). This inequality means the spectral radius ρ of the coefficient matrix is 2/(HB²/(6α)) = 12α/HB². Based on the analysis of [10], our ODE method should choose ε < (4/3)(1/2)(HB²/(6α)); then for any h, 0 < h < ∞, our method is stable.

For α = 10, the results of example 2 are listed in Table 2, where we use the error of u(t, 52, 52, 52) as a representative to give a general idea of the accuracy of our calculation. NFE has the same meaning as in example 1. The interval of integration is [0, 1]; at the end point the error is 0.3128 × 10⁻⁴.

For the explicit Euler method, if we take h = HB²/(6α) then ρh is at the margin of the stability region. In order to obtain better stability we take h = 0.9 × HB²/(6α). The results are listed in Table 3.

TABLE 2. The results of our ODE method. ε = 1.3 × 0.5 × HB²/60 = 0.104814 × 10⁻⁴.

TABLE 3. The results of the Euler method. h = 0.145127 × 10⁻⁴.

The average step size of our method is 1/1814 ≈ 0.000551, which is about 0.000551/0.0000145127 ≈ 38 times as large as Euler's. The time consumption proportion is 975/41904 ≈ 1/43.

From Tables 2 and 3 we can see that at the beginning of the integration our method has a larger local error, while at the end of the integration interval it has a smaller accumulated error. This phenomenon shows that our method quickly settles into the steady state.

C. LINEAR PDE BOUNDARY VALUE PROBLEM
In order to compare our ODE method with restarted GMRES(m) [16], we consider the following 2-dimensional PDE boundary value problem in the domain [0, π]²:

  −Uxx − Uyy + 0.1(Ux + Uy) = 2 sin x sin y + 0.1(cos x sin y + sin x cos y)

Obviously, U(x, y) = sin x sin y satisfies the equation. We discretize the differential equation by the normal 5-point formula with uniform mesh width HB = π/501; then, for the interior mesh points ((I − 1)HB, (J − 1)HB), I, J = 2, 3, . . . , 501, the function values U(I, J) = U((I − 1)HB, (J − 1)HB) comprise a system of linear equations L(U(I, J)) = 0 of dimension 250,000. The boundary point values are merged into the right hand side; in our case they are all equal to zero.

The remaining settings are similar to those of example 1; we omit them and just list the results of both algorithms in Tables 4 and 5. The iteration initial values are U(I, J) = 0.5, I, J = 2, 3, . . . , 501.

TABLE 4. The results of the ODE method for example 3. ε = 0.5; the initial Inf. Norm of L(U(I, J)) is 0.100031 × 10¹.

TABLE 5. The results of GMRES(50) for example 3.

If we set the stopping condition Inf. Norm < 10⁻⁷, we get more accurate results:
1) Our ODE method: Max Error = 0.114459 × 10⁻², NFE = 2704, time consumed = 785.
2) GMRES(50): Max Error = 0.103391 × 10⁻², number of restarts = 152, time consumed = 2266.
Comparing the four time consumption figures, 557/1504 and 785/2266, the proportion in both cases is about 1/3.

Although our ODE method does not need the Jacobian, its many function evaluations are still a defect, especially when the function F(X) includes many elementary functions, which is just the case in this example: F(X) includes many sine and cosine evaluations, so the function evaluation is very time consuming. On the other hand, the Jacobian is very simple. Based on these two facts, we give an accelerating convergence approach. This approach is related to the inexact Newton method [17]. It may seem strange: the inexact Newton method is designed for nonlinear equations, using some iterative method as a tool to solve the inner linear equations. However, it does not matter! We regard the linear equation AX = b as a nonlinear equation F(X) = −AX + b = 0.
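The discrete operator L(U) of example 3 can be written with array slicing and sanity-checked on a coarser mesh. The sketch below is our own reading of the scheme (the h²-scaled 5-point Laplacian plus centered first differences), and the 42-point mesh is our downscaled stand-in for the paper's 502 points. With it, the exact solution sin x sin y annihilates L up to truncation error, and the flat initial guess U = 0.5 gives an Inf. norm close to 1 at the corner points, consistent with the initial norm 0.100031 × 10¹ reported for the fine mesh.

```python
import numpy as np

n = 42                                   # downscaled mesh (paper: 502 points)
HB = np.pi / (n - 1)
x = np.arange(n) * HB
X, Y = np.meshgrid(x, x, indexing="ij")
f = 2 * np.sin(X) * np.sin(Y) + 0.1 * (np.cos(X) * np.sin(Y) + np.sin(X) * np.cos(Y))

def L(U):
    """h^2-scaled residual of -Uxx - Uyy + 0.1(Ux + Uy) = f on interior points."""
    Ui = U[1:-1, 1:-1]
    lap = 4 * Ui - U[:-2, 1:-1] - U[2:, 1:-1] - U[1:-1, :-2] - U[1:-1, 2:]
    dx = (U[2:, 1:-1] - U[:-2, 1:-1]) / (2 * HB)     # centered Ux
    dy = (U[1:-1, 2:] - U[1:-1, :-2]) / (2 * HB)     # centered Uy
    return lap + HB**2 * (0.1 * (dx + dy) - f[1:-1, 1:-1])

U_exact = np.sin(X) * np.sin(Y)
print(np.max(np.abs(L(U_exact))))   # truncation error only, very small

U0 = np.full((n, n), 0.5)
U0[0, :] = U0[-1, :] = U0[:, 0] = U0[:, -1] = 0.0    # zero boundary values
print(np.max(np.abs(L(U0))))        # close to 1, attained at the corners
```

The largest initial residual sits at the corner unknowns, where two of the four neighbours are boundary zeros; that is where the value near 1 comes from.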


Then we combine any iterative method with the inexact Newton method as follows:
1) Given the Newton iteration initial value X_0, calculate F(X_0) = −AX_0 + b. Since the Jacobian is −A, we get the linear equation AΔX = F(X_0).
2) For the linear equation AΔX = F(X_0), set the iteration initial value ΔX_0 = 0.0.
3) Write the initial residual norm ||R(ΔX_0)|| = ||AΔX_0 − F(X_0)|| = R_0.
4) Use any iterative method to solve for ΔX; the iteration stopping condition is R_n/R_0 < 0.1, where R_n = ||R(ΔX_n)|| = ||AΔX_n − F(X_0)||. We get ΔX_n.
5) Return to the main program and calculate the revised value of X: X_1 = X_0 + ΔX_n.
6) Check the inequality ||F(X_1)|| < Tol; if it holds, end the iteration, otherwise set X_1 ⇒ X_0 and GO TO 1).

TABLE 6. The accelerated convergence process for the linear equation using the ODE method as the inner iterative method. Initial Inf. Norm ||F(x)|| = 1.001.

In this example the Jacobian −A does not need to be formed. In fact, we can neglect the entries related to the 1st order derivatives Ux, Uy, because all of them include a factor HB and are very small compared with the entries related to the 2nd order terms Uxx, Uyy. After this simplification the Jacobian is just a 5-band matrix, with diagonal entries 4 and off-diagonal entries −1, so every matrix-vector multiplication needs only N scalar multiplications and 4N subtractions. In this example "inexact" has a dual meaning: one part is the inexact iterative accuracy, the other is the inexact Jacobian.

For our ODE method, the step size h is determined by the magnitude of ||F(X)||. The details are as follows:
  ||F(X)|| > 1: h = 2.5;
  1 > ||F(X)|| > 0.1: h = 5.0;
  0.1 > ||F(X)|| > 0.01: h = 10.0;
  0.01 > ||F(X)|| > 0.001: h = 20.0;
  0.001 > ||F(X)|| > 0.0001: h = 40.0;
  0.0001 > ||F(X)|| > 0.00001: h = 80.0;
  0.00001 > ||F(X)|| > 0.000001: h = 100.0.

The results are listed in Table 6. In the table, NT denotes the number of outer Newton iterations and NL the number of inner linear equation iterations; the accumulated numbers of inner iterations are in brackets. The total NL = 9082, the time consumed = 268, and Max Error = 0.822352 × 10⁻⁵.

From the data above we can see that the linear iteration count, 9082, seems too large compared with the 1906 on the last row of Table 4, the result of the non-accelerated method; but the time proportion is 268/557. That is to say, although our new approach needs more iterations, the work amount of each iteration is much less than in the non-accelerated method, and it obtains a very accurate solution.

For GMRES(50) we still use the simplified 5-band Jacobian in the inner GMRES iteration. The results are listed in Table 7.

TABLE 7. The results using GMRES(50) as the inner iteration.

Collecting the data of the four algorithms (stopping condition: Inf. Norm of F(x) < 10⁻⁶):
1) ODE method: number of iterations 1906, time consumed 557, Max Error = 0.438952 × 10⁻².
2) GMRES(50): restart number 106, time consumed 1504, Max Error = 0.105555 × 10⁻¹.
3) Inexact NT (ODE): NT iterations 12, time consumed 268, Max Error = 0.822352 × 10⁻⁵.
4) Inexact NT (GMRES(50)): NT iterations 10, time consumed 1731, Max Error = 0.112507 × 10⁻³.

From the data above, the inexact NT (ODE) method performs best. This is just one example, but with additional work the approach can be turned into a general method in the near future.

Comparing the results of 1) and 3), if we set the stopping condition as ||F|| < 10⁻⁴, a further promising result is the following:
5) Inexact NT (ODE) (with Tol = 10⁻⁴): NT iterations 5, time consumed 63, Max Error = 0.403872 × 10⁻².
That is to say, it has the same accuracy as 1), but the time consumption proportion is 63/557 ≈ 1/9.
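The outer loop 1)-6) can be sketched with a cheap inner iteration standing in for the paper's ODE method or GMRES(50). Everything below is our own minimal illustration: the inner solver is plain Jacobi, and the small random diagonally dominant system is a stand-in for example 3's 250,000-dimensional matrix.

```python
import numpy as np

def inexact_newton_linear(A, b, x0, inner, tol=1e-10, eta=0.1, max_outer=200):
    """Steps 1)-6) for F(X) = -A X + b: each outer pass solves A dX = F(X)
    only roughly (inner residual reduced below eta = 0.1), then X <- X + dX."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_outer):
        Fx = b - A @ x                    # F(X) = -A X + b
        if np.max(np.abs(Fx)) < tol:      # step 6: Inf. norm test
            break
        dx = inner(A, Fx, eta)            # step 4: rough solve of A dX = F(X)
        x = x + dx                        # step 5: X1 = X0 + dX
    return x

def jacobi_inner(A, r0, eta, max_iter=10000):
    """Jacobi sweeps on A dX = r0, stopped once ||A dX - r0|| / ||r0|| < eta."""
    d = np.diag(A)
    dx = np.zeros_like(r0)
    R0 = np.max(np.abs(r0))
    for _ in range(max_iter):
        res = r0 - A @ dx
        if np.max(np.abs(res)) / R0 < eta:
            break
        dx = dx + res / d                 # one Jacobi sweep
    return dx

# Small strictly diagonally dominant test system (exact solution: all ones).
rng = np.random.default_rng(0)
G = rng.standard_normal((50, 50))
A = G + np.diag(np.abs(G).sum(axis=1) + 1.0)
b = A @ np.ones(50)
x = inexact_newton_linear(A, b, np.zeros(50), jacobi_inner)
print(np.max(np.abs(x - 1.0)))            # tiny
```

Because the problem is linear, each outer pass shrinks ||F|| by at least the factor eta, so roughly log(tol)/log(eta) outer iterations suffice; the trade-off the text describes is that the inner sweeps are individually very cheap.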


Compared with 2), GMRES(50), the time consumption proportion is 63/1504 ≈ 1/25. Furthermore, the accuracy is improved: 0.403872 × 10⁻² vs. 0.105555 × 10⁻¹.

D. GENERALIZED BROWN ALMOST LINEAR PROBLEMS
In [6] we solved a tough equation, the Brown almost linear problem, for N = 100. Can we enlarge the dimension N from 100 to 1,000,000? For the first (N − 1) linear equations there is no problem; the difficulty occurs in the last nonlinear equation, ∏_{i=1}^{N} x_i − 1 = 0. We generalize this problem in the following way. For N = 1,000,000:

  f_i(X) = x_i + Σ_{j=1}^{N} x_j − (N + 1) = 0,  i = 1, 2, . . . , N − 1
  f_N(X) = ∏_{i=1}^{100} x_{i×10000} − 1 = 0

The initial values are x_i(0) = 0.5. The solution to be sought is x* = (1, 1, . . . , 1)^T.

Set ε = 0.2 × 10⁻⁵, Tol1 = 1.0, Tol2 = 10⁻³, Tol3 = 10⁻⁵, h_1 = 0.0001, h_2 = 0.05, h_3 = 0.1. After 36 iterations we have the following results:

  x_i = 1.0000000000, i = 1, 2, . . . , N − 1
  x_N = 0.9999999803
  Inf. Norm of f(X) = 0.6060 × 10⁻⁵

E. NONSYMMETRIC LINEAR PROBLEM
Consider the following nonsymmetric linear problem of dimension N = 1,000,000: the linear equation Ax = B, where

  A =
  | 2    0.5   1    1   · · ·  1   |
  | 1    2    0.5   1   · · ·  1   |
  |      . .        . .            |
  | 1    1    · · ·     2    0.5   |
  | 1    1    · · ·     1    2     |

  B = (b_1, b_2, . . . , b_N)^T
  b_i = 1,000,000.5, i = 1, 2, . . . , N − 1
  b_N = 1,000,001.0

Set ε = 0.2 × 10⁻⁵. For the initial values x_i(0) = 0.5 and 5.0, we set Tol1 = 1.0, Tol2 = 10⁻³, Tol3 = 10⁻⁶, h_1 = 0.1, h_2 = 0.2, h_3 = 0.3. We have the results:

1) For x_i(0) = 0.5, after 9 iterations the Inf. Norm of f = 0.1693 × 10⁻⁶:

  x_i = 1.0000000000, i = 1, 2, . . . , 999997
  x_999998 = 0.9999999938
  x_999999 = 1.0000000563
  x_1000000 = 0.9999998875

2) For x_i(0) = 5.0, after 15 iterations the Inf. Norm of f = 0.2731 × 10⁻⁶:

  x_i = 1.0000000000, i = 1, 2, . . . , 999996
  x_999997 = 0.9999999776
  x_999998 = 1.0000002076
  x_999999 = 0.9999995501
  x_1000000 = 1.0000000901

For the initial value x_i(0) = 50.0 we cannot make ||f|| < 10⁻⁶. We set Tol1 = 1.0, Tol2 = 10⁻², Tol3 = 10⁻⁴, h_1 = 0.1, h_2 = 0.2, h_3 = 0.3. After 11 iterations, the Inf. Norm of f = 0.8103 × 10⁻⁴:

  x_i = 0.9999999999, i = 1, 2, . . . , 999996
  x_999997 = 0.9999999173
  x_999998 = 1.0000012498
  x_999999 = 0.9999943524
  x_1000000 = 1.0000075451

V. CONCLUSION
In this paper we have shown the efficiency of our method with five examples. Especially for example 5, a nonsymmetric linear system of dimension N = 1,000,000, we obtained the solution with only 9 function evaluations. As for example 4, it is large (dimension = 1,000,000) and also tough (its last equation is nonlinear of degree 100), yet we handled it without any trouble. Examples 1, 2, and 3 are related to partial differential equations; this area is a main source of large scale system problems. Many mathematical models of practical science and engineering are described by PDEs, so solving these equations efficiently is of great significance. A general point of view is that implicit methods are more efficient than explicit methods; but when the system is very large, parallel computation should be employed, and a parallel implicit algorithm is hard to code. So people have turned their attention to developing new explicit methods to satisfy the needs of parallel computation. Our method is the simplest one among those explicit methods: all the variables in the formulas appear in vector form, and each component of F(x) can be calculated independently.

In this paper we did not treat hyperbolic equations. We hope to continue our research on them in the near future.

REFERENCES
[1] J. D. Hoffman, Numerical Methods for Engineers and Scientists, 2nd ed. New York, NY, USA: McGraw-Hill, 1992.
[2] A. A. Samarskii, Theory of Difference Schemes. New York, NY, USA: Marcel Dekker, 2001.
[3] D. V. Hutton, Fundamentals of Finite Element Analysis. New York, NY, USA: McGraw-Hill, 2005.
[4] R. Eymard, T. R. Gallouët, and R. Herbin, "The finite volume method," in Handbook of Numerical Analysis, vol. 7. Amsterdam, The Netherlands: North Holland, 2000, pp. 713–1020.
[5] R. J. LeVeque, Finite Volume Methods for Hyperbolic Problems. Cambridge, U.K.: Cambridge Univ. Press, 2002.
[6] T. Han and Y. Han, "Solving large scale nonlinear equation by a new ODE numerical integration method," Appl. Math., vol. 1, no. 3, pp. 222–229, Sep. 2010.


[7] T. Han and Y. Han, "Solving implicit equation arising from Adams–Moulton methods," BIT Numer. Math., vol. 42, no. 2, pp. 336–350, 2002.
[8] C. T. Kelley and L.-Z. Liao, "Explicit pseudo-transient continuation," Pacific J. Optim., vol. 9, pp. 77–91, Jan. 2013.
[9] J. D. Lambert, Computational Methods in Ordinary Differential Equations. New York, NY, USA: Wiley, 1973.
[10] T. Han, X. Luo, and Y. Han, "Solving large scale unconstrained minimization problems by a new ODE numerical integration method," Appl. Math., vol. 2, no. 5, pp. 527–532, May 2011.
[11] L. Adams and D. Xie, "New parallel SOR method by domain partitioning," SIAM J. Sci. Comput., vol. 20, no. 6, pp. 2261–2281, 1999.
[12] W. Niethammer, "The SOR method on parallel computers," Numer. Math., vol. 56, nos. 2–3, pp. 247–254, 1989.
[13] D. J. Evans and A. R. B. Abdullah, "Group explicit method for parabolic equations," Int. J. Comput. Math., vol. 14, no. 1, pp. 73–105, 1983.
[14] D. J. Evans and A. R. B. Abdullah, "A new explicit method for the diffusion-convection equations," Comput. Math. Appl., vol. 11, nos. 1–3, pp. 145–154, Jan./Mar. 1985.
[15] R. S. Varga, Matrix Iterative Analysis. Englewood Cliffs, NJ, USA: Prentice-Hall, 1962.
[16] Y. Saad, Iterative Methods for Sparse Linear Systems. Boston, MA, USA: PWS Publishing Company, 1996.
[17] R. S. Dembo, S. C. Eisenstat, and T. Steihaug, "Inexact Newton methods," SIAM J. Numer. Anal., vol. 19, no. 2, pp. 400–408, 1982.

TIANMIN HAN received the degree in mathematics from Lanzhou University, Gansu, China, in 1961, where he was the Chair of the Student Association of the same department.

He was with the China Academy of Sciences, Beijing, China. His research fields are stiff differential equations and ill-conditioned algebraic equations. His representative publication is entitled "A Numerical Method for Solving Initial Value Problems of Ordinary Differential Equations," SCIENTIA SINICA, vol. XIX, no. 2, pp. 180–198, 1976 (SCIENTIA SINICA is the top scientific journal in China). The abstract of this paper and very positive review comments by Prof. J. C. Butcher (Auckland) can be found in Mathematics Review, USA, vol. 56, no. 4, 10016, p. 1356, Oct. 1978. While working for the China Academy of Sciences, he organized nationwide education programs on computational methods and popularized computer applications in various fields. He was one of the two chief organizers of the National Computation Mathematics Academy Conference in 1974 (well known as the 748 Conference in China), which was of very important significance for the development of Chinese computational mathematics. In 1985 he started working for the China Electric Power Research Institute, Beijing, teaching M.S. and Ph.D. students and continuing his research. His main interests are numerical methods for various large scale systems, such as linear and nonlinear equations, ordinary differential equations, PDE initial and boundary value problems, and optimization.

YUHUAN HAN received the B.S. degree in computer science from Beijing Polytechnic University, Beijing, China, in 1993, and the M.S. degree in operations research and management science from George Mason University, Fairfax, VA, USA, in 1999.

He has been working for Fortune 500 companies for over 13 years in senior positions. He is the co-founder of Hedge Fund of America, LP, DE, USA, a registered company listed in the Morningstar database. He is the President of a private foundation registered in California, USA. He founded an online shopping search engine, serving as its Chief Technical Officer. He was with the China Academy of Sciences, Beijing, China.

