Euro. Jnl of Applied Mathematics (2020), page 1 of 15. © The Author(s), 2020. Published by Cambridge University Press.
doi:10.1017/S0956792520000182

Solving parametric PDE problems with artificial neural networks

YUEHAW KHOO¹, JIANFENG LU² and LEXING YING³

¹ Department of Statistics, University of Chicago, IL 60615, USA (email: ykhoo@uchicago.edu)
² Department of Mathematics, Department of Chemistry and Department of Physics, Duke University, Durham, NC 27708, USA (email: jianfeng@math.duke.edu)
³ Department of Mathematics and ICME, Stanford University, Stanford, CA 94305, USA (email: lexing@stanford.edu)

(Received 30 August 2019; revised 21 April 2020; accepted 27 May 2020)

The curse of dimensionality is commonly encountered in numerical partial differential equations (PDE), especially when uncertainties have to be modelled into the equations as random coefficients. However, very often the variability of physical quantities derived from a PDE can be captured by a few features on the space of coefficient fields. Based on this observation, we propose using a neural network to parameterise the physical quantity of interest as a function of the input coefficients. The representability of such a quantity using a neural network can be justified by viewing the neural network as performing time evolution to find the solution of the PDE. We further demonstrate the simplicity and accuracy of the approach through notable examples of PDEs in engineering and physics.

Key words: Neural-network, parametric PDE, uncertainty quantification

2020 Mathematics Subject Classification: 65Nxx

1 Introduction
Uncertainty quantification in physical and engineering applications often involves the study of partial differential equations (PDE) with random coefficient fields. To understand the behaviour of a system in the presence of uncertainties, one can extract PDE-derived physical quantities as functionals of the coefficient fields. This can potentially require solving the PDE numerically an exponential number of times, even with a suitable discretisation of the PDE domain and of the range of the random variables. Fortunately, in most PDE applications these functionals depend only on a few characteristic 'features' of the coefficient fields, allowing them to be determined from solving the PDE a limited number of times.
A commonly used approach to uncertainty quantification is Monte Carlo sampling. An ensemble of solutions is built by repeatedly solving the PDE with different realisations of the coefficients. Then physical quantities of interest, for example, the mean of the solution at a given location, can be computed from the ensemble of solutions. Although applicable in many situations, the
computed quantity is inherently noisy. Moreover, the approach cannot produce new solutions that were not previously sampled. Other approaches exploit the underlying low-dimensionality assumption more directly. For example, the stochastic Galerkin method [13, 16] expands the random solution using certain prefixed basis functions (i.e. polynomial chaos [18, 19]) on the space of random variables, thereby reducing the high-dimensional problem to a few deterministic PDEs. This type of method requires careful treatment of the uncertainty distributions, and since the basis used is problem independent, the method can be expensive when the dimensionality of the random variables is high. There are data-driven approaches for basis learning, such as applying the Karhunen–Loève expansion to PDE solutions from different realisations of the PDE [3]. Similar to the related principal component analysis, such linear dimension-reduction techniques may not fully exploit the nonlinear interplay between the random variables. At the end of the day, the problem of uncertainty quantification is one of characterising the low-dimensional structure of the coefficient field that gives rise to the observed quantities.
On the other hand, the problem of dimensionality reduction has been central to the fields of statistics and machine learning. The fundamental task of regression seeks to find a function $h_\theta$, parameterised by a parameter vector $\theta \in \mathbb{R}^p$, such that

$$f(a) \approx h_\theta(a), \qquad a \in \mathbb{R}^n. \quad (1.1)$$

However, choosing a sufficiently large class of approximation functions without running into over-fitting remains a delicate business. As an example, in linear regression, the standard procedure is to fix a set of basis functions (or feature maps) $\{\phi_k(a)\}$ such that

$$f(a) = \sum_k \beta_k \phi_k(a) \quad (1.2)$$

and determine the coefficients $\beta_k$ from sampled data. The choice of basis is important to the quality of the regression, just as in the case of studying PDEs with random coefficients. Recently, deep neural networks have demonstrated unprecedented success in solving a variety of difficult regression problems related to pattern recognition [8, 11, 15]. A key advantage of neural networks is that they bypass the traditional need to handcraft a basis for spanning $f(a)$; instead, they learn the optimal basis that satisfies (1.1) directly from data. The performance of neural networks in machine learning applications, and more recently in physical applications such as representing quantum many-body states (e.g. [17, 2]), prompts us to study their use in the context of solving PDEs with random coefficients. More precisely, we want to learn the function $f(a)$ that maps the coefficient vector $a$ of a PDE to some physical quantity described by the PDE.
Our approach to solving for quantities arising from a PDE with random coefficients consists of the following simple steps (a minimal sketch of this pipeline is given after the list):

• Sample the random coefficients ($a$ in (1.1)) of the PDE from a user-specified distribution. For each set of coefficients, solve the deterministic PDE to obtain the physical quantity of interest ($f(a)$ in (1.1)).
• Use a neural network as the surrogate model $h_\theta(a)$ in (1.1) and train it using the previously obtained samples.
• Validate the surrogate forward model with more samples. The neural network is then ready for applications.
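The following Python sketch illustrates the three steps above in their simplest form. It is not the authors' code: `sample_coefficients`, `solve_pde_quantity` and `build_network` are hypothetical placeholders standing in for the problem-specific sampler, the deterministic PDE solver and a constructor returning a compiled Keras-style model, all of which are made concrete later in the paper.

```python
import numpy as np

def surrogate_pipeline(sample_coefficients, solve_pde_quantity, build_network,
                       num_train=10000, num_valid=10000):
    # Step 1: sample coefficient fields and solve the deterministic PDE for each.
    a_train = np.stack([sample_coefficients() for _ in range(num_train)])
    y_train = np.array([solve_pde_quantity(a) for a in a_train])
    a_valid = np.stack([sample_coefficients() for _ in range(num_valid)])
    y_valid = np.array([solve_pde_quantity(a) for a in a_valid])

    # Step 2: train the neural-network surrogate h_theta on the sampled pairs.
    model = build_network(a_train.shape[1:])
    model.fit(a_train, y_train, batch_size=100, epochs=50)

    # Step 3: validate on held-out samples before using the surrogate in applications.
    validation_error = model.evaluate(a_valid, y_valid)
    return model, validation_error
```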


Though this is a simple method, to the best of our knowledge, dimension reduction based on neural network representation has not been adapted to solving PDEs with uncertainties. We consider two simple but representative parametric PDE tasks in this work: elliptic homogenisation and a nonlinear Schrödinger eigenvalue problem. The main contributions of our work are as follows:

• We provide theoretical guarantees on the neural network representation of $f(a)$ through explicit construction for the parametric PDE problems under study;
• We show that even a rather simple neural network architecture can learn a good representation of $f(a)$ through training.

We note that our work is different from [10, 14, 12, 9], which solve deterministic PDEs numerically using a neural network. The goal of those works is to parameterise the solution of a deterministic PDE using a neural network and to replace Galerkin-type methods when performing model reduction. It is also different from [6], where a deterministic PDE is solved as a stochastic control problem using a neural network. In this paper, the function that we want to parameterise is defined over the coefficient field of the PDE.
The advantages of having an explicitly parameterised approximation to $f(a)$ are numerous; we list only a couple here. First, the neural-network-parameterised function can serve as a surrogate forward model for generating samples cheaply for statistical analysis. Second, the task of optimising some function of the physical quantity with respect to the PDE coefficients can be done with the help of a gradient calculated from the neural network. To summarise, obtaining a neural network parametrisation could limit the use of expensive PDE solvers in applications.
We demonstrate the success of neural networks in two PDE applications. In particular, we consider solving for the effective conductance in inhomogeneous media and the ground state energy of a nonlinear Schrödinger equation (NLSE) with an inhomogeneous potential. These are important physical models with wide applications in physics and engineering.
In Section 2, we provide background on the two PDEs of interest. In Section 3, we provide the theoretical justification for using a neural network (NN) to represent the physical quantities derived from the PDEs introduced in Section 2. In Section 4, we describe the neural network architecture for handling these PDE problems and report the numerical results. We conclude in Section 5.

2 Two examples of parametric PDE problems


This section introduces the two PDE models – the linear elliptic equation and the NLSE – we
want to solve for. We focus on the map from the coefficient field of these equations to certain
physical quantities of interest. In both cases, the boundary condition is taken to be periodic for
simplicity.

2.1 Effective coefficients for an inhomogeneous elliptic equation

Our first example is finding the effective conductance in an inhomogeneous medium. For this, we consider the elliptic equation

$$\nabla \cdot \big(a(x)(\nabla u(x) + \xi)\big) = 0, \qquad x \in [0,1]^d \quad (2.1)$$

with periodic boundary conditions, where $\xi \in \mathbb{R}^d$, $\|\xi\|_2^2 = 1$ ($\|\cdot\|_2$ is the Euclidean norm). To ensure ellipticity, we consider the class of coefficient functions

$$\mathcal{A} = \{a \in L^\infty([0,1]^d) \mid \lambda_1 \ge a \ge \lambda_0 > 0\}. \quad (2.2)$$

For a given $\xi$, we want to obtain the effective conductance functional $A_{\mathrm{eff}}: \mathcal{A} \to \mathbb{R}$ defined by

$$A_{\mathrm{eff}}(a) = \int_{[0,1]^d} a(x)\,\|\nabla u_a(x) + \xi\|_2^2\, dx = -\int_{[0,1]^d} \Big( u_a(x)\,\nabla\cdot\big(a(x)(\nabla u_a(x) + 2\xi)\big) - a(x) \Big)\, dx, \quad (2.3)$$

where $u_a$ satisfies (2.1) (the subscript '$a$' in $u_a$ denotes the dependence of the solution of (2.1) on the coefficient field $a$). The second equality follows from integration by parts.
In practice, to parameterise $A_{\mathrm{eff}}$ as a function of the coefficient field $a(x)$, we discretise the domain using a uniform grid with step size $h$ and grid points $x_i = ih$, where the multi-index $i \in \{(i_1, \ldots, i_d)\}$, $i_1, \ldots, i_d = 1, \ldots, n$ with $n = 1/h$. In this way, we can think of both the coefficient field $a(x)$ and the solution $u(x)$ evaluated on the grid points as vectors of length $n^d$. More precisely, let the action of the operator $\nabla \cdot (a(x)\nabla u(x))$ on $u$ be discretised using central differences as

$$\sum_{k=1}^{d} \frac{\tfrac{1}{2}(a_{i+e_k} + a_i)(u_{i+e_k} - u_i) + \tfrac{1}{2}(a_{i-e_k} + a_i)(u_{i-e_k} - u_i)}{h^2} \quad (2.4)$$

for each $i$, where $\{e_k\}_{k=1}^{d}$ denotes the canonical basis of $\mathbb{R}^d$. Here $a_i := a(i/n)$ and $u_i := u(i/n)$.
Then the discrete version of (2.1) is obtained as

$$(L_a u + b_{\xi,a})_i := \sum_{k=1}^{d} \frac{(a_{i+e_k} + a_i)u_{i+e_k} + (a_i + a_{i-e_k})u_{i-e_k} - (a_{i-e_k} + 2a_i + a_{i+e_k})u_i}{2h^2} + \sum_{k=1}^{d} \frac{\xi_k\,(a_{i+e_k} - a_{i-e_k})}{2h} = 0, \qquad \forall i, \quad (2.5)$$

where the first equality gives the definitions of $L_a$ and $b_{\xi,a}$. The discrete version of the effective conductance is obtained as $2E(u_a; a)$, where

$$E(u; a) = -\frac{1}{2}\, u^\top L_a u - u^\top b_{\xi,a} + \frac{1}{2n^d}\, a^\top \mathbf{1}, \quad (2.6)$$

and $\mathbf{1}$ is the all-one vector.
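For readers who want to reproduce the discrete objects, the following numpy sketch implements the action of $L_a$, the vector $b_{\xi,a}$ and the energy $E(u;a)$ on a 2D periodic grid, directly mirroring (2.5) and (2.6). The helper names and the use of `np.roll` for the periodic shifts are our own choices, not the paper's.

```python
import numpy as np

def apply_La(a, u, h):
    """Action of L_a on u, cf. (2.5), with periodic boundary conditions (d = 2)."""
    out = np.zeros_like(u)
    for axis in (0, 1):
        a_p = np.roll(a, -1, axis)   # a_{i+e_k}
        a_m = np.roll(a, +1, axis)   # a_{i-e_k}
        u_p = np.roll(u, -1, axis)
        u_m = np.roll(u, +1, axis)
        out += ((a_p + a) * u_p + (a + a_m) * u_m - (a_m + 2 * a + a_p) * u) / (2 * h**2)
    return out

def b_xi_a(a, xi, h):
    """The vector b_{xi,a} appearing in (2.5)."""
    out = np.zeros_like(a)
    for axis in (0, 1):
        out += xi[axis] * (np.roll(a, -1, axis) - np.roll(a, +1, axis)) / (2 * h)
    return out

def energy(u, a, xi, h):
    """E(u; a) of (2.6); the effective conductance is 2*E(u_a; a) at the minimiser."""
    n_d = a.size
    return (-0.5 * np.sum(u * apply_La(a, u, h))
            - np.sum(u * b_xi_a(a, xi, h))
            + np.sum(a) / (2 * n_d))
```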

2.2 NLSE with random potential

For the second example, we want to find the ground state energy $E_0$ of an NLSE with potential $a(x)$:

$$-\Delta u(x) + a(x)u(x) + \sigma u(x)^3 = E_0\, u(x), \qquad x \in [0,1]^d \quad (2.7)$$

subject to the normalisation constraint

$$\int_{[0,1]^d} u(x)^2\, dx = 1. \quad (2.8)$$

We take $\sigma = 2$ in this work and thus consider a defocusing cubic Schrödinger equation, which can be understood as a model for solitons in nonlinear photonics or for a Bose–Einstein condensate in an inhomogeneous medium. Similar to (2.5), we solve the discretised version of the NLSE

$$-(Lu)_i + a_i u_i + \sigma u_i^3 = E_0\, u_i \quad \forall i, \qquad \frac{1}{n^d}\sum_{i=1}^{n^d} u_i^2 = 1, \quad (2.9)$$

where

$$(Lu)_i := \sum_{k=1}^{d} \frac{u_{i+e_k} + u_{i-e_k} - 2u_i}{h^2}. \quad (2.10)$$

Due to the nonlinear cubic term, the NLSE is more difficult to solve numerically than (2.1). Therefore, in this case, the value of having a surrogate model of $E_0$ as a function of $a$ is more significant. We note that the solution $u$ of (2.9) (and thus $E_0$) can also be obtained from the variational problem

$$\min_{u:\ \|u\|_2^2 = n^d}\ -u^\top L u + u^\top \mathrm{diag}(a)\, u + \frac{\sigma}{2}\sum_i u_i^4, \quad (2.11)$$

where the $\mathrm{diag}(\cdot)$ operator forms a diagonal matrix from a vector.
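A corresponding numpy sketch for the NLSE quantities (again with our own helper names, for $d = 2$): the discrete Laplacian action of (2.10) and the objective of the variational problem (2.11).

```python
import numpy as np

def apply_L(u, h):
    """Discrete Laplacian (Lu)_i of (2.10) with periodic boundary conditions, d = 2."""
    out = np.zeros_like(u)
    for axis in (0, 1):
        out += (np.roll(u, -1, axis) + np.roll(u, +1, axis) - 2 * u) / h**2
    return out

def nlse_objective(u, a, sigma, h):
    """Objective of the variational problem (2.11), minimised over ||u||_2^2 = n^d."""
    return (-np.sum(u * apply_L(u, h))
            + np.sum(a * u**2)
            + 0.5 * sigma * np.sum(u**4))
```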

3 Theoretical justification of the deep neural network representation

The physical quantities introduced in Section 2 are determined through the solution of the PDEs given the coefficient field. Rather than solving the PDE, we will prove that the map from the coefficient field to such quantities can be represented using a convolutional NN. The main idea is to view the solution $u$ of the PDE as being obtained via time evolution, where each layer of the NN corresponds to the solution at a discrete time step. We focus here on the case of solving the elliptic equation with inhomogeneous coefficients. A similar line of reasoning can be used to demonstrate the representability of the ground state energy $E_0$ as a function of $a$ using an NN.

Theorem 1 Fix an error tolerance $\epsilon > 0$. There exists a neural network $h_\theta(\cdot)$ with $O(n^d)$ hidden nodes per layer and $O(n^2/\epsilon)$ layers such that for any $a \in \mathcal{A} = \{a \in L^\infty([0,1]^d) \mid a(x) \in [\lambda_0, \lambda_1]\ \forall x,\ \lambda_0 > 0\}$, we have

$$|h_\theta(a) - A_{\mathrm{eff}}(a)| \le \epsilon\, \lambda_1. \quad (3.1)$$
The proof of Theorem 1 is given in Appendix A. Note that due to the ellipticity assumption $a \in \mathcal{A}$, the effective conductance is bounded from below by $A_{\mathrm{eff}}(a) \ge \lambda_0 > 0$. Therefore, the theorem immediately implies the relative error bound

$$\frac{|h_\theta(a) - A_{\mathrm{eff}}(a)|}{A_{\mathrm{eff}}(a)} \le \frac{\epsilon\, \lambda_1}{\lambda_0}. \quad (3.2)$$
We illustrate the main idea of the proof in the rest of this section; the technical details of the proof are deferred to the supplementary materials.
First, observe that the effective coefficient obeys a variational principle,

$$A_{\mathrm{eff}}(a) = 2 \min_u E(u; a), \quad (3.3)$$

where $E(u;a)$ is given in (2.6) and $L_a$, $b_{\xi,a}$ are defined in (2.5). Therefore, to obtain $A_{\mathrm{eff}}(a)$, we may minimise $E(u; a)$ over the solution space using, e.g., steepest descent:

$$u^{m+1} = u^m - t\,\frac{\partial E(u^m; a)}{\partial u} = u^m + t\big(L_a u^m + b_{\xi,a}\big), \quad (3.4)$$

where $t$ is a step size chosen sufficiently small to ensure descent of the energy. Note that the optimisation problem is convex due to the ellipticity assumption (2.2) on the coefficient field $a$ (which ensures $-u^\top L_a u > 0$ except when $u$ is a constant vector), and the gradient is Lipschitz continuous; therefore, the iterative scheme converges to the minimiser for a proper choice of step size and any initial condition. Thus we can choose $u^0 = 0$.
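Putting (3.3) and (3.4) together, a direct (if slow) way to evaluate $A_{\mathrm{eff}}(a)$ numerically is sketched below, reusing the helpers `apply_La`, `b_xi_a` and `energy` from the earlier snippet (our own names); the step-size heuristic and iteration count are illustrative, not tuned.

```python
import numpy as np

def effective_conductance(a, xi=(1.0, 0.0), num_steps=5000, t=None):
    """Approximate A_eff(a) = 2 * min_u E(u; a) by the steepest-descent iteration (3.4)."""
    n = a.shape[0]
    h = 1.0 / n
    if t is None:
        t = h**2 / (8.0 * a.max())   # conservative step size; ||L_a|| = O(a.max() / h^2)
    xi = np.asarray(xi)
    b = b_xi_a(a, xi, h)
    u = np.zeros_like(a)             # initial condition u^0 = 0
    for _ in range(num_steps):
        u = u + t * (apply_La(a, u, h) + b)   # one steepest-descent step, cf. (3.4)
    return 2.0 * energy(u, a, xi, h)
```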
Now we can identify the iteration scheme with a convolutional NN architecture by viewing $m$ as an index of the NN layers. The input of the NN is the vector $a \in \mathbb{R}^{n^d}$, and the hidden layers are used to map between consecutive pairs of $(d+1)$-tensors $U^m_{i_0 i_1 \ldots i_d}$ and $U^{m+1}_{i_0 i_1 \ldots i_d}$. The zeroth dimension of each tensor $U^m$ is the channel dimension and the last $d$ dimensions are the spatial dimensions. If we let the channels in each $U^m$ consist of a copy of $a$ and a copy of $u^m$, i.e.

$$U^m_{0\, i_1 \ldots i_d} = a_{(i_1,\ldots,i_d)}, \qquad U^m_{1\, i_1 \ldots i_d} = u^m_{(i_1,\ldots,i_d)}, \quad (3.5)$$

then, in light of (3.4) and (2.5), one simply needs to perform a local convolution (to aggregate $a$ locally) and a nonlinearity (to approximate the quadratic form of $a$ and $u^m$) to get from $U^m_{1\, i_1 \ldots i_d} = u^m_{(i_1,\ldots,i_d)}$ to $U^{m+1}_{1\, i_1 \ldots i_d} = u^{m+1}_{(i_1,\ldots,i_d)}$, while the 0-channel is simply copied to carry along the information of $a$. Stopping at the $m = M$ layer and letting $u^M$ be the approximate minimiser of $E(u; a)$, based on (3.3) we obtain an approximation to $A_{\mathrm{eff}}(a)$. Note that the architecture of the NN used in the proof resembles a deep ResNet [7], as the coefficient field $a$ is passed from the first to the last layer. The detailed estimates of the approximation error and the number of parameters are deferred to the supplementary materials.
Let us point out that if we take the continuous-time limit of the steepest descent dynamics, we obtain a system of ODEs,

$$\partial_t u = L_a u + b_{\xi,a}, \quad (3.6)$$

which can be viewed as a spatial discretisation of a PDE. Thus our construction of the neural network in the proof is also related to the work [12], where multiple layers of convolutional NN are used to learn and solve evolutionary PDEs. However, the goal of the neural network here is to approximate the physical quantity of interest as a functional of the (high-dimensional) coefficient field, which is quite different from the viewpoint of [12].
We also remark that the number of layers of the NN required by Theorem 1 is rather large. This is due to the choice of the (unpreconditioned) steepest descent algorithm as the engine of optimisation used to generate the neural network architecture in the proof. With a better preconditioner, such as algebraic multigrid [20], we can effectively reduce the number of layers to $O(1)$ and thus achieve an optimal count of the parameters involved in the NN. In practice, as shown in the next section by actual training on parametric PDEs, the neural network architecture can be much simplified while maintaining a good approximation to the quantity of interest.

4 Proposed network architecture and numerical results

In this section, based on the discussion in Section 3, we propose using convolutional NNs to approximate the physical quantities given by the PDEs with periodic boundary conditions. The architecture of the neural network is described in Section 4.1, and the implementation details and numerical results are provided in Sections 4.2 and 4.3, respectively.


FIGURE 1. Construction of the NN in the proof of Theorem 1. The NN takes the coefficient field $a$ as input, and the convolutional layers are used to map from $(u^m, a)$ to $(u^{m+1}, a)$, $m = 0, \ldots, M-1$. At $u^M$, local convolutions and a nonlinearity are used to obtain $E(u^M; a)$.

FIGURE 2. The map between $(u^t, a)$ and $u^{t+1}$.

FIGURE 3. The map between $(u^M, a)$ and $E$.

4.1 Architectures
In this subsection, we present two different architectures. Since we have shown theoretically that the NN proposed in Figure 1 can approximate the effective conductance function, the NN in Figure 1 serves as the basis for the first architecture. In the second architecture, we simply use an NN that incorporates translational symmetry. For most of the numerical tests in the subsequent sections, we report results using the second architecture, since its simplicity demonstrates the effectiveness of an NN. However, results for the first NN architecture are also given in order to provide numerical evidence for Theorem 1 in the case of determining the effective conductance.
The first architecture is based on a ResNet [7]; the construction of the NN is illustrated in Figure 1. Figures 2 and 3 detail the explicit maps between $(u^t, a)$ and $u^{t+1}$, and between $(u^M, a)$ and $E$, respectively. For the sake of illustration, we assume a 2D unit square domain, though the construction generalises to PDEs in any dimension. The input to the NN is a matrix $a \in \mathbb{R}^{n\times n}$ representing the coefficient field on the grid points, and the output of the network gives the physical quantity of interest from the PDE. Based on (3.4) and (2.5), the cubic function that
maps from $(u^t, a)$ to $u^{t+1}$ is realised via a few convolutional layers, as in Figure 2. Since the function $E(u^M; a)$ has translational symmetry, in the sense that

$$E\big(u^M_{(i+\tau_1)(j+\tau_2)};\ a_{(i+\tau_1)(j+\tau_2)}\big) = E(u^M; a), \quad (4.1)$$

where the additions are done on $\mathbb{Z}_n$, a sum-pooling is used in Figure 3 to preserve the translational symmetry of the function $E$.

FIGURE 4. Single-convolutional-layer neural network for representing a translationally invariant function.
Figure 4 shows the second architecture for the 2D case. When designing this architecture, we forgo the PDE knowledge used to prove Theorem 1 and simply use the fact that

$$f(a) := E(u_a; a) \quad (4.2)$$

has translational symmetry in terms of the input $a$. Again, the main part of the network consists of convolutional layers with ReLU as the nonlinearity. This extracts the relevant features of the coefficient field around each grid point that contribute to the final output. The use of a sum-pooling followed by a linear map to obtain the final output is again based on the translational symmetry of the function $f$ to be represented. More precisely, let $a^{\tau_1\tau_2}_{ij} := a_{(i+\tau_1)(j+\tau_2)}$, where the additions are done on $\mathbb{Z}_n$. The output of the convolutional layers gives basis functions that satisfy

$$\tilde\phi_{kij}(a^{\tau_1\tau_2}) = \tilde\phi_{k(i-\tau_1)(j-\tau_2)}(a), \qquad \forall (\tau_1, \tau_2) \in \{1, \ldots, n\}^2. \quad (4.3)$$

When using the architecture in Figure 4, for any $\tau_1, \tau_2$,

$$f(a^{\tau_1\tau_2}) = \sum_{k=1}^{\alpha} \beta_k \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde\phi_{k(i-\tau_1)(j-\tau_2)}(a) = \sum_{k=1}^{\alpha} \beta_k\, \phi_k(a), \qquad \phi_k := \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde\phi_{kij}, \quad (4.4)$$

where the $\beta_k$'s are the weights of the last densely connected layer. Therefore, the translational symmetry of $f$ is preserved.
We note that all operations in Figures 3 and 4 are standard except the padding operation. Typically, zero-padding is used to enlarge the size of the input in image classification tasks, whereas here we extend the input periodically due to the assumed periodic boundary condition.
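To make the description above concrete, here is a minimal Keras sketch of a Figure-4-style architecture: periodic padding, a few convolutional layers with ReLU, a sum-pooling over the spatial dimensions and a final dense layer. The helper names (`periodic_pad_2d`, `build_model`) and the kernel size, depth and channel count are our own illustrative choices, not values taken from the paper.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def periodic_pad_2d(x, pad):
    # Extend the n x n input periodically, matching the periodic boundary condition.
    x = tf.concat([x[:, -pad:, :, :], x, x[:, :pad, :, :]], axis=1)
    x = tf.concat([x[:, :, -pad:, :], x, x[:, :, :pad, :]], axis=2)
    return x

def build_model(n, alpha=16, kernel=3, depth=3):
    a = keras.Input(shape=(n, n, 1))                 # coefficient field on the grid
    x = a
    for _ in range(depth):
        x = layers.Lambda(lambda t, p=kernel // 2: periodic_pad_2d(t, p))(x)
        x = layers.Conv2D(alpha, kernel, padding="valid", activation="relu")(x)
    # Sum-pooling over space gives the symmetrised basis functions phi_k of (4.4).
    x = layers.Lambda(lambda t: tf.reduce_sum(t, axis=[1, 2]))(x)
    out = layers.Dense(1)(x)                         # the weights beta_k of (4.4)
    return keras.Model(a, out)
```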

4.2 Implementation

The neural network is implemented using Keras [4], an application programming interface running on top of TensorFlow [1] (a library of toolboxes for training neural networks). A mean-squared-error loss function is used, and the optimisation is done using the NAdam optimiser [5].

Table 1. Error in approximating the effective conductance function $A_{\mathrm{eff}}(a)$ in 2D using the architecture in Figure 1. The definition of $\alpha$ is given in Figure 1. We test the NN with data models 1 and 2, where $a_1, \ldots, a_{n^2}$ are generated from an independent and a correlated random field, respectively. The mean and standard deviation of the effective conductance are computed from the samples in order to show the variability. The sample sizes for training and validation are the same.

Data model | n  | α  | Training error | Validation error | Average Aeff | No. of samples | No. of param.
1          | 8  | 10 | 2.0e-3         | 1.8e-3           | 1.58 ± 0.10  | 1.2e+4         | 4416
1          | 16 | 10 | 1.5e-3         | 1.4e-3           | 1.58 ± 0.052 | 2.4e+4         | 4416
2          | 8  | 10 | 1.2e-3         | 1.2e-3           | 4.87 ± 1.00  | 1.2e+4         | 4416
2          | 16 | 10 | 6.8e-4         | 6.7e-4           | 4.97 ± 1.01  | 2.4e+4         | 4416

The hyper-parameter we tune is the learning rate, which we lower if the training error fluctuates too much. The weights are initialised randomly from the normal distribution. The input to the neural network is whitened to have unit variance and zero mean in each dimension. The mini-batch size is always set between 50 and 200.
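A hypothetical training call consistent with this description (MSE loss, NAdam, whitened inputs, mini-batches in the 50 to 200 range) is sketched below. The data arrays are random placeholders so that the snippet runs stand-alone, and `build_model` refers to the architecture sketch in Section 4.1.

```python
import numpy as np
from tensorflow import keras

n = 16
a_train = np.random.uniform(0.3, 3.0, size=(1000, n, n, 1)).astype("float32")  # placeholder data
y_train = np.random.rand(1000, 1).astype("float32")                            # placeholder targets

# Whiten the input to zero mean and unit variance in each dimension.
mean, std = a_train.mean(axis=0), a_train.std(axis=0)
a_white = (a_train - mean) / std

model = build_model(n)
model.compile(optimizer=keras.optimizers.Nadam(learning_rate=1e-3), loss="mse")
model.fit(a_white, y_train, batch_size=100, epochs=20, validation_split=0.5)
```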

4.3 Numerical examples

4.3.1 Effective conductance

For the case of the effective conductance, we assume the following distributions for the input data (a sketch for generating samples from the second model is given after the list):

1. Independent and identically distributed random variables $a_i$, $i = 1, \ldots, n^d$, distributed according to $U[0.3, 3]$, where $U[\lambda_0, \lambda_1]$ denotes the uniform distribution on the interval $[\lambda_0, \lambda_1]$.
2. Correlated random variables $a_i$, $i = 1, \ldots, n^d$, with covariance matrix $PP^\top \in \mathbb{R}^{n^d \times n^d}$:

$$a = Pb \in \mathbb{R}^{n^d}, \quad (4.5)$$

where $b_1, \ldots, b_{n^d}$ are independently and identically distributed as $U[0.3, 5]$, and the covariance matrix is defined by

$$[PP^\top]_{ij} = \exp\Big(-\frac{\|x_i - x_j\|^2}{(2h)^2}\Big), \qquad i, j = 1, \ldots, n^d. \quad (4.6)$$
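A possible way to generate samples from the correlated model (4.5)-(4.6) is sketched below, using the symmetric square root of the prescribed covariance as the factor $P$; the paper does not specify its choice of $P$, so this is one valid option among many.

```python
import numpy as np

def correlated_coefficients(n, num_samples, low=0.3, high=5.0, seed=0):
    """Draw a = P b, with P P^T given by (4.6), on a 2D n x n grid (d = 2)."""
    h = 1.0 / n
    xs = np.arange(1, n + 1) * h
    grid = np.stack(np.meshgrid(xs, xs, indexing="ij"), axis=-1).reshape(-1, 2)
    dist2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
    cov = np.exp(-dist2 / (2 * h) ** 2)                    # [PP^T]_{ij} from (4.6)
    # One valid factor P: the symmetric square root of the covariance matrix.
    w, V = np.linalg.eigh(cov)
    P = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
    rng = np.random.default_rng(seed)
    b = rng.uniform(low, high, size=(num_samples, n * n))  # b_i ~ U[0.3, 5]
    return (b @ P.T).reshape(num_samples, n, n)
```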
The results of learning the effective conductance function are presented in Table 1. We use the same number of samples for training and validation. Both the training and validation errors are measured by

$$\frac{\sum_k \big(h_\theta(a^k) - A_{\mathrm{eff}}(a^k)\big)^2}{\sum_k A_{\mathrm{eff}}(a^k)^2}, \quad (4.7)$$

where the $a^k$'s can be either the training or the validation samples, drawn from the same distribution, and $h_\theta$ is the neural-network-parameterised approximation function. For this experiment in $d = 2$,
we report the results using two different architectures. In Table 1, the results for the architecture in Figure 1 with M = 5 provide numerical evidence for Theorem 1. In Table 2, the more economical NN in Figure 5 is used to demonstrate that by simply considering the symmetry in the function to be approximated (without using PDE knowledge), results with relatively good accuracy can already be obtained. The simple NN in Figure 5 gives results similar to the deep NN in Figure 1, indicating that we might be able to further reduce the complexity of the NN in Theorem 1 via a more careful construction.

Table 2. Error in approximating the effective conductance function $A_{\mathrm{eff}}(a)$ in 2D using the architecture in Figure 5. The definition of $\alpha$ is given in Figure 5. The sample sizes for training and validation are the same.

Data model | n  | α  | Training error | Validation error | Average Aeff | No. of samples | No. of param.
1          | 8  | 16 | 1.5e-3         | 1.4e-3           | 1.58 ± 0.10  | 1.2e+4         | 1057
1          | 16 | 16 | 2.2e-3         | 1.8e-3           | 1.58 ± 0.052 | 2.4e+4         | 4129
2          | 8  | 16 | 1.0e-3         | 1.0e-3           | 4.87 ± 1.00  | 1.2e+4         | 1057
2          | 16 | 16 | 2.5e-3         | 2.5e-3           | 4.97 ± 1.01  | 2.4e+4         | 4129

FIGURE 5. Neural network architecture for approximating $A_{\mathrm{eff}}(a)$ in the 1D case. Although the layers in the third stage are essentially densely connected layers, we still identify them as convolution layers to reflect the symmetry between the first and third stages.
Before concluding this subsection, we use the exercise of determining the effective conductance in 1D to provide another motivation for the usage of a neural network. In 1D, the effective conductance can be expressed analytically as the harmonic mean of the $a_i$'s:

$$A_{\mathrm{eff}}(a) = \left(\frac{1}{n}\sum_{i=1}^{n} \frac{1}{a_i}\right)^{-1}. \quad (4.8)$$

This function indeed corresponds approximately to the deep neural network shown in Figure 5. The neural network is separated into three stages. In the first stage, an approximation to the function $1/a_i$ is constructed for each $a_i$ by applying a few convolution layers with a size-1 kernel window. In this stage, the channel size of these convolution layers is chosen to be 16, except for the last layer, since the output of the first stage should be a vector of size $n$. In the second stage, a layer of sum-pooling with a size-$n$ window is used to perform the summation in (4.8), giving a scalar output.

Table 3. Error in approximating the lowest energy level $E_0(V)$ for $n = 8, 16$ discretisation using the architecture in Figure 5.

n  | α | Training error | Validation error | Average E0   | No. of samples | No. of param.
8  | 5 | 4.9 × 10⁻⁴     | 5.0 × 10⁻⁴       | 10.48 ± 0.51 | 4800           | 331
16 | 5 | 1.5 × 10⁻⁴     | 1.5 × 10⁻⁴       | 10.46 ± 0.27 | 1.05 × 10⁴     | 1291

FIGURE 6. The first stage’s output of the neural network in Figure 5 fitted by β1 /x + β2 .

The third and first stages have exactly the same architecture, except that the input to the third stage is a scalar. For training, 2560 samples are used, and another 2560 samples are used for validation. We let $a_i \sim U[0.3, 1.5]$, giving an effective conductance of $0.77 \pm 0.13$ for $n = 8$. A validation error of $4.9 \times 10^{-4}$ is obtained with the neural network in Figure 5, while with the network in Figure 4 the accuracy is $5.5 \times 10^{-3}$ with $\alpha = 16$. As a sanity check, Figure 6 shows that the output from the first stage is well fitted by the reciprocal function.
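As an aside, (4.8) also means that the 1D training data can be generated without any PDE solve; a small check of the numbers quoted above (our own script, not the authors'):

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_samples = 8, 2560
a = rng.uniform(0.3, 1.5, size=(num_samples, n))   # a_i ~ U[0.3, 1.5], as in the text
A_eff = 1.0 / np.mean(1.0 / a, axis=1)             # harmonic mean, eq. (4.8)
print(A_eff.mean(), A_eff.std())                   # compare with the 0.77 +/- 0.13 quoted above
```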
We remark that although incorporating PDE domain knowledge to build a more sophisticated neural network architecture would likely boost the approximation quality, as we do in the constructive proof of Theorem 1, even a simple network such as that in Figure 4 can already give decent results.

4.3.2 Ground state energy of NLSE

We next focus on the 2D case of the NLSE example. The goal here is to obtain a neural network parametrisation of $E_0(a)$, with the input now being $a \in \mathbb{R}^{n^2}$ with i.i.d. entries distributed according to $U[1, 16]$. In order to generate training samples, for each realisation of $a$, the nonlinear eigenvalue problem (2.9) subject to the normalisation constraint (2.8) is solved by a homotopy method. First, the case $\sigma = 0$ is solved as a standard eigenvalue problem. Then $\sigma$ is increased from 0 to 2 with a step size of 0.4. For each $\sigma$, Newton's method is used to solve the NLSE for $u_a(x)$ and
$E_0(a)$, using the $u_a(x)$ and $E_0(a)$ corresponding to the previous $\sigma$ value as initialisation. The results are presented in Table 3.
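The paper's sample generation uses a homotopy in $\sigma$ with Newton's method at each step. As a simplified, hedged substitute, the sketch below follows the same homotopy but replaces Newton's method with projected gradient descent on the variational problem (2.11); it reuses `apply_L` from the sketch in Section 2.2, and the step size and iteration counts are illustrative only, not tuned.

```python
import numpy as np

def nlse_ground_state_energy(a, sigma_final=2.0, dsigma=0.4, step=2e-4, iters=5000):
    """Estimate E_0(a) for the discrete NLSE (2.9) on a 2D periodic grid."""
    n = a.shape[0]
    h = 1.0 / n
    u = np.ones_like(a)                       # already satisfies ||u||_2^2 = n^2
    for sigma in np.arange(0.0, sigma_final + 1e-9, dsigma):   # homotopy in sigma
        for _ in range(iters):
            grad = -apply_L(u, h) + a * u + sigma * u**3        # half the gradient of (2.11)
            u = u - step * grad
            u *= n / np.linalg.norm(u)        # project back onto ||u||_2^2 = n^d
    # Recover E_0 from the discrete NLSE (2.9) as a Rayleigh-type quotient.
    return np.sum(u * (-apply_L(u, h) + a * u + sigma_final * u**3)) / np.sum(u**2)

rng = np.random.default_rng(0)
a = rng.uniform(1.0, 16.0, size=(16, 16))     # i.i.d. U[1, 16] potential, as in the text
print(nlse_ground_state_energy(a))
```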

5 Conclusion
In this note, we present a method based on deep neural networks to solve PDEs with inhomogeneous coefficient fields. Physical quantities of interest are learned as functions of the coefficient field. Based on the time-evolution technique for solving PDEs, we provide theoretical motivation for representing these quantities using an NN. The numerical experiments on the diffusion equation and the NLSE show the effectiveness of simple convolutional neural networks in parametrising such functions to $10^{-3}$ accuracy. We remark that while many questions remain open, such as what the best network architecture is and which situations this approach can handle, the goal of this short note is simply to suggest neural networks as a promising tool for model reduction when solving PDEs with uncertainties.

Conflict of interest
None.

References
[1] ABADI, M., AGARWAL, A., BARHAM, P., BREVDO, E., CHEN, Z., CITRO, C., CORRADO, G. S.,
DAVIS, A., DEAN, J., DEVIN, M., GHEMAWAT, S., GOODFELLOW, I., HARP, A., IRVING, G.,
ISARD, M., JIA, Y., JOZEFOWICZ, R., KAISER, L., KUDLUR, J., LEVENBERG, M., MANE, D.,
MONGA, R., MOORE, S., MURRAY, D., OLAH, C., SCHUSTER, M., SHLENS, J., STEINER, B.,
SUTSKEVER, I., TALWAR, K., TUCKER, P., VANHOUCKE, V., VASUDEVAN, V., VIEGAS, F.,
VINYALS, O., WARDEN, P., WATTENBERG, M., WICKE, M., YU, Y. & ZHENG, X. (2016)
Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint
arXiv:1603.04467.
[2] CARLEO, G. & TROYER, M. (2017) Solving the quantum many-body problem with artificial neural
networks. Science 355(6325), 602–606.
[3] CHENG, M., HOU, T. Y., YAN, M. & ZHANG, Z. (2013) A data-driven stochastic method for elliptic
PDEs with random coefficients. SIAM/ASA J. Uncertainty Quant. 1(1), 452–493.
[4] CHOLLET, F. (2015) Keras. http://keras.io.
[5] DOZAT, T. (2016) Incorporating Nesterov momentum into ADAM. In: Proceedings of the ICLR
Workshop.
[6] HAN, J., JENTZEN, A. & WEINAN E. (2017) Overcoming the curse of dimensionality: solving high-
dimensional partial differential equations using deep learning. arXiv preprint arXiv:1707.02568.
[7] HE, K., ZHANG, X., REN, S. & SUN, J. (2016) Deep residual learning for image recognition. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
[8] HINTON, G. E. & SALAKHUTDINOV, R. R. (2006) Reducing the dimensionality of data with neural
networks. Science 313(5786), 504–507.
[9] KHOO, Y., LU, J. & YING, L. (2018) Solving for high dimensional committor functions using artificial
neural networks. arXiv preprint arXiv:1802.10275.
[10] LAGARIS, I. E., LIKAS, A. & FOTIADIS, D. I. (1998) Artificial neural networks for solving ordinary
and partial differential equations. IEEE Trans. Neural Networks 9(5), 987–1000.
[11] LECUN, Y., BENGIO, Y. & HINTON, G. (2015) Deep learning. Nature 521(7553), 436–444.
[12] LONG, Z., LU, Y., MA, X. & DONG, B. (2017) PDE-net: learning PDEs from data. arXiv preprint
arXiv:1710.09668.

[13] MATTHIES, H. G. & KEESE, A. (2005) Galerkin methods for linear and nonlinear elliptic stochastic
partial differential equations. Comput. Methods Appl. Mecha. Eng. 194(12), 1295–1331.
[14] RUDD, K. & FERRARI, S. (2015) A constrained integration (CINT) approach to solving partial
differential equations using artificial neural networks. Neurocomputing 155, 277–285.
[15] SCHMIDHUBER, J. (2015) Deep learning in neural networks: an overview. Neural Networks 61, 85–
117.
[16] STEFANOU, G. (2009) The stochastic finite element method: past, present and future. Comput.
Methods Appl. Mecha. Eng. 198(9), 1031–1051.
[17] TORLAI, G. & MELKO, R. G. (2016) Learning thermodynamics with Boltzmann machines. Phys. Rev.
B 94(16), 165134.
[18] WIENER, N. (1938) The homogeneous chaos. Am. J. Math. 60(4), 897–936.
[19] XIU, D. & KARNIADAKIS, G. E. (2002) The Wiener–Askey polynomial chaos for stochastic
differential equations. SIAM J. Sci. Comput. 24(2), 619–644.
[20] XU, J. & ZIKATANOV, L. (2017) Algebraic multigrid methods. Acta Numerica 26, 591–721.

A Appendix

A.1 Proof of representability of the effective conductance by an NN

As mentioned in Section 3, the first step of constructing an NN to represent the effective conductance is to perform time-evolution iterations of the form (3.4). However, since at each step we need to approximate the map from $u^m$ to $u^{m+1}$ in (3.4) using an NN, the process of time evolution is similar to applying noisy gradient descent to $E(u; a)$. More precisely, after performing a step of the gradient descent update, the NN approximation adds noise to the update, i.e.

$$v^0 = u^0 = 0, \qquad u^{m+1} = v^m - t\,\nabla E(v^m), \qquad v^{m+1} = u^{m+1} + t\,\varepsilon^{m+1}. \quad (A.1)$$

Here $E(u; a)$ is abbreviated as $E(u)$, and $\varepsilon^{m+1}$ is the error of each NN layer in approximating the exact time-evolution iterate $u^{m+1}$.
Now let the spectral norms of $L_a$ and $L_a^\dagger$ satisfy

$$\|L_a\|_2 \le \lambda_a, \qquad \|L_a^\dagger\|_2 \le 1/\mu_a, \quad (A.2)$$

where, for the case considered, $\lambda_a = O(\lambda_1 n^2)$ and $\mu_a = \Omega(\lambda_0)$. Assuming

$$\|\varepsilon^{m+1}\|_2 \le c\,\|\nabla E(v^m)\|_2, \qquad \mathbf{1}^\top \varepsilon^{m+1} = 0, \qquad m = 0, \ldots, M-1, \quad (A.3)$$

the following lemma can be obtained.

Lemma 1 The iterations in (3.4) satisfy

$$E(v^{m+1}) - E(v^m) \le -\frac{t}{2}\,\|\nabla E(v^m)\|_2^2, \quad (A.4)$$

if $t \le \delta$, where

$$\delta = \Big(1 - \frac{1}{2(1-c)}\Big)\frac{2}{\bar\lambda_a}, \qquad \bar\lambda_a = \Big(1 + \frac{c^2}{1-c}\Big)\lambda_a.$$

Furthermore,

$$\frac{t}{2}\sum_{m=0}^{M-1} \|\nabla E(v^m)\|_2^2 \le E(v^0) - E(v^M) \le E(v^0). \quad (A.5)$$

Proof From the Lipschitz property of $\nabla E(u)$ implied by (A.2),

$$\begin{aligned}
E(v^{m+1}) - E(v^m)
&\le \langle \nabla E(v^m),\, v^{m+1} - v^m \rangle + \frac{\lambda_a}{2}\,\|v^{m+1} - v^m\|_2^2 \\
&= \langle \nabla E(v^m),\, v^m - t(\nabla E(v^m) + \varepsilon^{m+1}) - v^m \rangle
   + \frac{\lambda_a}{2}\,\|v^m - t(\nabla E(v^m) + \varepsilon^{m+1}) - v^m\|_2^2 \\
&= -t\Big(1 - \frac{t\lambda_a}{2}\Big)\|\nabla E(v^m)\|_2^2
   + t\Big(1 - \frac{t\lambda_a}{2}\Big)\langle \varepsilon^{m+1}, \nabla E(v^m)\rangle
   + \frac{\lambda_a t^2}{2}\,\|\varepsilon^{m+1}\|_2^2 \\
&\le -t\Big(1 - \frac{t\lambda_a}{2}\Big)\|\nabla E(v^m)\|_2^2
   + ct\Big(1 - \frac{t\lambda_a}{2} + \frac{ct\lambda_a}{2}\Big)\|\nabla E(v^m)\|_2^2 \\
&= -t\Big((1-c) - \frac{t\lambda_a}{2}(1 - c + c^2)\Big)\|\nabla E(v^m)\|_2^2 \\
&= -t(1-c)\Big(1 - \frac{1 - c + c^2}{1-c}\cdot\frac{t\lambda_a}{2}\Big)\|\nabla E(v^m)\|_2^2 \\
&= -t(1-c)\Big(1 - \frac{t\bar\lambda_a}{2}\Big)\|\nabla E(v^m)\|_2^2.
\end{aligned} \quad (A.6)$$

Letting $t \le \big(1 - \frac{1}{2(1-c)}\big)\frac{2}{\bar\lambda_a}$, we get

$$E(v^{m+1}) - E(v^m) \le -\frac{t}{2}\,\|\nabla E(v^m)\|_2^2. \quad (A.7)$$

Summing the left- and right-hand sides over $m$ and using the fact that $E(u) \ge 0$ gives (A.5). This concludes the lemma.

Theorem 2 If $t$ satisfies the condition in Lemma 1, then, given any $\epsilon > 0$, $|E(v^M) - E(u^*)| \le \epsilon$ for $M = O\big((\lambda_1^2 + \lambda_1^2/\lambda_0 + \lambda_1)\, n^2/\epsilon\big)$.

Proof By convexity,

$$E(u^*) - E(v^m) \ge \langle \nabla E(v^m),\, u^* - v^m \rangle, \quad (A.8)$$

which, combined with Lemma 1, gives

$$\begin{aligned}
E(v^{m+1})
&\le E(u^*) + \langle \nabla E(v^m),\, v^m - u^* \rangle - \frac{t}{2}\|\nabla E(v^m)\|_2^2 \\
&= E(u^*) + \frac{1}{2t}\Big( 2t\langle \nabla E(v^m), v^m - u^*\rangle - t^2\|\nabla E(v^m)\|_2^2
   + \|v^m - u^*\|_2^2 - \|v^m - u^*\|_2^2 \Big) \\
&= E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^m - t\nabla E(v^m) - u^*\|_2^2 \Big) \\
&= E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - t\varepsilon^{m+1} - u^*\|_2^2 \Big) \\
&= E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2
   + 2t\langle \varepsilon^{m+1}, v^{m+1} - u^*\rangle - t^2\|\varepsilon^{m+1}\|_2^2 \Big) \\
&= E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2 + t^2\|\varepsilon^{m+1}\|_2^2
   + 2t\langle \varepsilon^{m+1}, v^m - u^*\rangle - 2t\langle \varepsilon^{m+1}, \nabla E(v^m)\rangle \Big) \\
&\le E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2 + t^2\|\varepsilon^{m+1}\|_2^2
   + 2t\,\|\varepsilon^{m+1}\|_2\big(\|v^m - u^*\|_2 + \|\nabla E(v^m)\|_2\big) \Big) \\
&\le E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2 + t^2\|\varepsilon^{m+1}\|_2^2
   + 2t(1 + 2/\mu_a)\,\|\varepsilon^{m+1}\|_2\,\|\nabla E(v^m)\|_2 \Big) \\
&\le E(u^*) + \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2 + c^2 t^2\,\|\nabla E(v^m)\|_2^2
   + 2c(1 + 2/\mu_a)\,t\,\|\nabla E(v^m)\|_2^2 \Big).
\end{aligned} \quad (A.9)$$

The second-to-last inequality follows from (A.2), which implies $\|L_a u\|_2 \ge \mu_a \|u\|_2$ if $u^\top \mathbf{1} = 0$. More precisely, the facts that $v^0 = 0$, that $\nabla E(u)^\top \mathbf{1} = 0$ (which follows from the form of $L_a$ and $b_{\xi,a}$ defined in (2.5)) and that $(\varepsilon^m)^\top \mathbf{1} = 0$ for all $m$ (due to the assumption in (A.3)) imply $(v^m)^\top \mathbf{1} = 0$; hence $\frac{\mu_a}{2}\|v^m - u^*\|_2 \le \|\nabla E(v^m) - \nabla E(u^*)\|_2 = \|\nabla E(v^m)\|_2$. Reorganising (A.9), we get

$$E(v^{m+1}) - E(u^*) \le \frac{1}{2t}\Big( \|v^m - u^*\|_2^2 - \|v^{m+1} - u^*\|_2^2
   + ct\big(ct + 2(1 + 2/\mu_a)\big)\,\|\nabla E(v^m)\|_2^2 \Big). \quad (A.10)$$

Summing both the left- and right-hand sides results in

$$E(v^M) - E(u^*) \le \frac{1}{M}\sum_{m=0}^{M-1}\big( E(v^{m+1}) - E(u^*) \big)
\le \frac{1}{M}\Big( \frac{\|v^0 - u^*\|_2^2}{2t} + \frac{2c}{t}\big(ct + 2(1 + 2/\mu_a)\big) E(v^0) \Big), \quad (A.11)$$

where the second inequality follows from (A.5). In order to derive a bound for $\|v^0 - u^*\|_2^2$, we appeal to the strong convexity of $E(u)$:

$$E(v^0) - E(u^*) \ge \langle \nabla E(u^*),\, v^0 - u^* \rangle + \frac{\mu_a}{2}\|v^0 - u^*\|_2^2 = \frac{\mu_a}{2}\|v^0 - u^*\|_2^2, \quad (A.12)$$

since $\langle \mathbf{1},\, v^0 - u^* \rangle = 0$. Then

$$E(v^M) - E(u^*) \le \frac{1}{M}\Big( \frac{1}{\mu_a t} + \frac{2c}{t}\big(ct + 2(1 + 2/\mu_a)\big) \Big) E(v^0). \quad (A.13)$$

Since $E(v^0) = \frac{a^\top \mathbf{1}}{2n^d} = O(\lambda_1)$, along with $\lambda_a = O(\lambda_1 n^2)$ and $\mu_a = \Omega(\lambda_0)$, we establish the claim.