
Partial Differential Equations in Applied Mathematics 7 (2023) 100499

Contents lists available at ScienceDirect

Partial Differential Equations in Applied Mathematics


journal homepage: www.elsevier.com/locate/padiff

Numerical solutions to low and high-dimensional Allen–Cahn equations using stochastic differential equations and neural networks✩

Shawn Koohy a, Guangming Yao b,∗, Kalani Rubasinghe b

a Department of Mathematics, University of Massachusetts Dartmouth, North Dartmouth, 02747, MA, USA
b Department of Mathematics, Clarkson University, 8 Clarkson Ave, Potsdam, 13699, NY, USA

ARTICLE INFO

Keywords: Stochastic differential equations; Neural networks; Allen–Cahn equations; Potential functions; Semilinear PDEs; Euler's method

ABSTRACT

In this paper, we focus on solving semilinear parabolic differential equations in low and high-dimensional spaces by using backward stochastic differential equations and deep neural networks (the BSDE solver introduced by Han et al. in 2017). A 5D test problem was created to test the accuracy of the algorithm and to understand the key parameters in the neural network. In addition, we focus on Allen–Cahn equations in 1D, 3D, and 60D with different potential functions, including higher-order polynomial potential functions. To the best of our knowledge, this is the first time that Allen–Cahn equations have been investigated in low and high-dimensional spaces using the same algorithm. Double-well potential functions and higher-order potential functions are also investigated using the same algorithm. Some patterns are observed through simulations with regard to the relations between the solutions and the order of the potential functions and the spatial dimensions.

1. Introduction

The curse of dimensionality is a well-known issue spread throughout problems in mathematics, physics, data science, or any situation that involves the organization, collection, and analysis of high-dimensional spaces1–3. This issue becomes very apparent in the attempt to solve widely applied and investigated second-order semilinear parabolic partial differential equations (PDEs) in high-dimensional spaces. In this paper, we focus on such PDEs in the following form, subject to a terminal condition:

$$
\begin{cases}
\dfrac{\partial u(t,x)}{\partial t} + \dfrac{1}{2}\operatorname{Tr}\!\left(\sigma\sigma^{T}(t,x)\,\mathbf{H}_{x}u(t,x)\right) + \nabla u(t,x)\cdot\mu(t,x) + f\!\left(t,x,u,\sigma^{T}\nabla u\right) = 0,\\[4pt]
u(T,x) = g(x),
\end{cases}
\tag{1.1}
$$

where x ∈ R^d, t ∈ R, Tr denotes the trace of a matrix, σ(t, x) is a known d × d matrix-valued function, H_x u(t, x) is the Hessian matrix with respect to the spatial variable x, ∇u is the gradient of u with respect to x, μ(t, x) is a known vector-valued function, and f is a known nonlinear function.

Semilinear parabolic PDEs describe numerous physical, chemical, biological, and financial phenomena. Famous PDEs such as the Allen–Cahn equations, the Black–Scholes equations, the Hamilton–Jacobi–Bellman equations, and many other equations usually take the form of diffusion–reaction or diffusion–convection–reaction equations. These systems can be very large, and numerical techniques are the key to solving such problems4,5. Typical semilinear PDEs in low-dimensional spaces can be solved by the multilevel Picard method6, finite element methods7, meshfree methods8, finite difference methods8, and others9–12. In13, a combination of the kernel method and the finite difference method was used to find numerical approximations of the Allen–Cahn equation. The radial basis function (RBF) method has also been applied to parabolic PDEs and second-order PDEs in14–16, especially in higher-dimensional spaces, but those methods are still generally limited to three-dimensional spaces.

The backward stochastic differential equation (BSDE) solver introduced in17–19 solves higher-order semilinear parabolic PDEs. The reason we work with BSDEs is the curse of dimensionality: when analyzing data and working with spatial regions, expanding towards higher dimensions exponentially increases the computational time and complexity of solving a problem. References such as20,21 and various other papers have simulated and solved stochastic differential equations (SDEs) successfully even in high-dimensional cases.

In recent years, the use of neural networks has been shown to greatly improve the computational ability to break this curse of dimensionality22–24. In25, the authors provided a proof that neural networks are able to overcome the curse of dimensionality for solving the Black–Scholes equation.

✩ This work was supported by NSA, USA H89230-22-1-0008: Summer Research Experience for Undergraduates in Math.
∗ Corresponding author.
E-mail address: gyao@clarkson.edu (G. Yao).

https://doi.org/10.1016/j.padiff.2023.100499
Received 15 November 2022; Received in revised form 7 February 2023; Accepted 7 February 2023

2666-8181/© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).

Many authors have further shown the abilities that neural networks possess towards breaking this curse for data analysis3,26,27 and for solving differential equations28–30.

Our goal in this paper is to find an approximation of u(t, x) for various t and x using machine learning and neural networks for semilinear parabolic PDEs in low and high-dimensional spaces, including traditional one- to three-dimensional spaces, a test five-dimensional PDE, and the Allen–Cahn equations in low and high-dimensional spaces, including sixty-dimensional space. With the relatively recent re-emergence of machine learning and its great usefulness and versatility, combining SDEs and neural networks gives us a powerful approach to solving low and high-dimensional PDEs31. In Section 2, we review Brownian motion and Itô's lemma, followed by a proof that a solution to the BSDE is equivalent to a solution of the original PDE. Before we use the BSDE solver for problems with unknown analytical solutions, we examine the algorithm on a benchmark diffusion–reaction equation in 5D. A 5D test problem was created to test the accuracy of the algorithm and to understand key parameters in the neural network, such as the number of layers, learning rate, initial guess, and activation functions. We discovered that the algorithm is flexible with respect to the initial guesses: it converges to the desired solutions fairly quickly, and learning rate adjustment can improve the algorithm's efficiency. We analyze how the network's solution behaves at different final times, along with using different initial starting points, activation functions, and heavy learning rate manipulation, in Section 4. Section 5 deals with various Allen–Cahn equations in low and high-dimensional spaces with various orders of potential functions. With these equations, we present their solutions at different points in time and space. Specifically for the Allen–Cahn equation, we explore solutions using varying parameters, the interaction length ε and the order of potential functions α, in 1D, 3D, and 60D. A short conclusion is drawn in Section 6.

2. Brownian motion and SDE

A stochastic process is a collection of random variables indexed by time. It can be given in two ways, through discrete or continuous time, yielding {X_0, X_1, …, X_t} or {X_t}_{t≥0}, respectively. As an alternate definition, a stochastic process can be thought of as a probability distribution over a space of paths.

We now introduce what is known as Brownian motion, sometimes referred to as a Wiener process. We denote Brownian motion by W_t.

Definition 2.1 (32). A real-valued stochastic process W(⋅) is called a Brownian motion (or Wiener process) if

(i) W(0) = 0 a.s. (a property which is true except for an event of probability zero is said to hold almost surely, usually abbreviated "a.s."),
(ii) W(t) − W(s) is N(0, t − s) for all t ≥ s ≥ 0,
(iii) for all times 0 < t_1 < t_2 < ⋯ < t_n, the random variables W(t_1), W(t_2) − W(t_1), …, W(t_n) − W(t_{n−1}) are independent ("independent increments").

Note that N(μ, σ²) represents a normal distribution with mean μ and standard deviation σ.

Definition 2.2. For a vector μ ∈ L¹ and a matrix σ ∈ L², X(t) is an Itô process if

$$
X_t = X_0 + \int_0^t \mu\, ds + \int_0^t \sigma\, dW. \tag{2.1}
$$

We say that X(t) has the stochastic differential form

$$
dX_t = \mu\, dt + \sigma\, dW_t \tag{2.2}
$$

for 0 ≤ t ≤ T.
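As a concrete illustration (not part of the original text), Brownian increments satisfying Definition 2.1 and a sample path of the Itô process (2.2) can be generated with NumPy; the constant coefficients μ = 1.0 and σ = 0.5 below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 1.0, 1000
dt = T / N

# Brownian increments: W(t) - W(s) ~ N(0, t - s), independent (Definition 2.1)
dW = rng.normal(loc=0.0, scale=np.sqrt(dt), size=N)
W = np.concatenate(([0.0], np.cumsum(dW)))        # W(0) = 0 a.s.

# Sample path of the Ito process dX_t = mu dt + sigma dW_t in (2.2),
# with constant coefficients chosen only for illustration
mu, sigma = 1.0, 0.5
X = np.zeros(N + 1)
for n in range(N):
    X[n + 1] = X[n] + mu * dt + sigma * dW[n]
```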

We will next review differentiation under Brownian motion, which is Itô's formula. Our goal is to show that the solution to an SDE can help us find the solution u to (1.1). In the case of (1.1), Itô's formula takes the following form.

Theorem 2.1 (Itô's Formula33). Suppose that X(t) has the stochastic differential

$$
dX_t = \mu\, dt + \sigma\, dW_t,
$$

for μ ∈ L¹, σ ∈ L². Assume u(t, x) : [0, T] × R → R is continuous and that ∂u/∂t, ∂u/∂x, ∂²u/∂x² exist and are continuous. Then Y_t = u(t, X_t) is again an Itô process and has the stochastic differential form

$$
\begin{aligned}
dY_t = du(t, X_t) &= \frac{\partial u}{\partial t}\,dt + \frac{\partial u}{\partial x}\,dX_t + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}(dX_t)^2 \\
&= \frac{\partial u}{\partial t}\,dt + \frac{\partial u}{\partial x}\,dX_t + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}\sigma^2\,dt \\
&= \frac{\partial u}{\partial t}\,dt + \frac{\partial u}{\partial x}\mu\,dt + \frac{\partial u}{\partial x}\sigma\,dW_t + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}\sigma^2\,dt. \qquad\square
\end{aligned}
$$

Theorem 2.2 (Multidimensional Itô's Formula33). For a vector μ and matrix σ, let X_t = (X_t^1, X_t^2, …, X_t^d)^T be a vector of Itô processes such that

$$
dX_t = \mu\, dt + \sigma\, dW_t.
$$

Then

$$
\begin{aligned}
du(t, X_t) &= \frac{\partial u}{\partial t}\,dt + (\nabla u)^T dX_t + \frac{1}{2}(dX_t)^T \mathbf{H}_x u\, dX_t \\
&= \left\{\frac{\partial u}{\partial t} + (\nabla u)^T\mu + \frac{1}{2}\operatorname{Tr}\!\left[\sigma^T\sigma\,\mathbf{H}_x u\right]\right\} dt + (\nabla u)^T\sigma\, dW_t,
\end{aligned}
\tag{2.3}
$$

where ∇u is the gradient of u w.r.t. x, H_x u is the Hessian matrix of u w.r.t. x, and Tr is the trace operator. □

Theorem 2.3 (31). The semilinear parabolic differential Eq. (1.1) has a solution u(t, x) if and only if u(t, x) satisfies the following backward stochastic differential equation (BSDE):

$$
\begin{aligned}
u(t, X_t) - u(0, X_0) = {}& -\int_0^t f\!\left(s, X_s, u(s, X_s), \sigma^T(s, X_s)\nabla u(s, X_s)\right) ds \\
& + \int_0^t \left[\nabla u(s, X_s)\right]^T \sigma(s, X_s)\, dW_s.
\end{aligned}
\tag{2.4}
$$

Proof. For simplicity, we rewrite (2.4) as

$$
u(t, X_t) - u(0, X_0) = -\int_0^t f\, ds + \int_0^t \left[\nabla u(s, X_s)\right]^T \sigma(s, X_s)\, dW_s. \tag{2.5}
$$

Let

$$
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T (Z_s)^T dW_s, \tag{2.6}
$$

$$
Z_t = \sigma^T(t, X_t)\nabla u(t, X_t). \tag{2.7}
$$

Then

$$
dY_t = -f(t, X_t, Y_t, Z_t)\, dt + (Z_t)^T dW_t. \tag{2.8}
$$

In addition, by Itô's formula, we have that

$$
d(u(t, X_t)) = \left\{u_t + \nabla u\cdot\mu + \frac{1}{2}\operatorname{Tr}(\sigma\sigma^T \mathbf{H}_x u)\right\} dt + [\nabla u]^T\sigma\, dW_t. \tag{2.9}
$$

• If u(t, x) is a solution to the semilinear parabolic equation (1.1), then

$$
u_t + \nabla u\cdot\mu + \frac{1}{2}\operatorname{Tr}(\sigma\sigma^T \mathbf{H}_x u) = -f(t, X_t, Y_t, Z_t).
$$

Thus,

$$
d(u(t, X_t)) = -f(t, X_t, Y_t, Z_t)\, dt + [\nabla u]^T\sigma\, dW_t. \tag{2.10}
$$

Note the definitions of Y_t and Z_t:

$$
d(u(t, X_t)) = -f(t, X_t, Y_t, Z_t)\, dt + (Z_t)^T dW_t = dY_t. \tag{2.11}
$$

Thus,

$$
\begin{aligned}
u(t, X_t) &= Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T (Z_s)^T dW_s, \\
u(0, X_0) &= Y_0 = g(X_T) + \int_0^T f(s, X_s, Y_s, Z_s)\, ds - \int_0^T (Z_s)^T dW_s.
\end{aligned}
\tag{2.12}
$$

Therefore,

$$
\begin{aligned}
u(t, X_t) - u(0, X_0) &= -\int_0^t f\!\left(s, X_s, u(s, X_s), \sigma^T(s, X_s)\nabla u(s, X_s)\right) ds + \int_0^t \left[\nabla u(s, X_s)\right]^T \sigma(s, X_s)\, dW_s \\
&= -\int_0^t f\, ds + \int_0^t (Z_s)^T dW_s.
\end{aligned}
$$

This is (2.4).

• If u(t, X_t) is a solution of (2.4), then

$$
\begin{aligned}
u(t, X_t) &= u(0, X_0) - \int_0^t f\, ds + \int_0^t (Z_s)^T dW_s, \\
du(t, X_t) &= -f\, dt + (Z_t)^T dW_t.
\end{aligned}
\tag{2.13}
$$

Thus, by (2.6) we have that

$$
u(T, X_T) = u(0, X_0) - \int_0^T f\, ds + \int_0^T (Z_s)^T dW_s = u(0, X_0) + g(X_T) - Y_0 = g(X_T). \tag{2.14}
$$

Thus, u(T, x) = g(x). On the other hand, recalling Itô's lemma and combining (2.9) and (2.13), we have that

$$
\begin{aligned}
\left\{u_t + \nabla u\cdot\mu + \frac{1}{2}\operatorname{Tr}(\sigma\sigma^T\mathbf{H}_x u)\right\} dt + [\nabla u]^T\sigma\, dW_t &= -f\, dt + (Z_t)^T dW_t, \\
u_t + \nabla u\cdot\mu + \frac{1}{2}\operatorname{Tr}(\sigma\sigma^T\mathbf{H}_x u) &= -f, \\
u_t + \nabla u\cdot\mu + \frac{1}{2}\operatorname{Tr}(\sigma\sigma^T\mathbf{H}_x u) + f &= 0.
\end{aligned}
\tag{2.15}
$$

Thus, we have a solution to (1.1), u(t, x). □

To approximate a solution to the PDE (1.1), especially in higher-dimensional spaces, (2.4) makes it possible to compute values of u at the terminal time T at any spatial point, where u(0, X_0) is a given initial condition.

3. Numerical solution to the BSDE

We solve (2.4) numerically to compute an approximation for u(0, X_0). Let u(0, X_0) = θ_{u_0} and ∇u(0, X_0) = θ_{∇u_0} be parameters of the numerical procedure. First, we need a time discretization to propagate in time, and then a neural network to approximate derivatives in the spatial variables during each time step.

An explicit Euler method is used for time discretization. We apply temporal discretization to (2.4) and partition the time interval [0, T] into 0 = t_0 < t_1 < ⋯ < t_N = T. Consider the Euler method for n = 1, …, N − 1:

$$
X_{t_{n+1}} - X_{t_n} \approx \mu(t_n, X_{t_n})\,\Delta t_n + \sigma(t_n, X_{t_n})\,\Delta W_n, \tag{3.1}
$$

where Δt_n = t_{n+1} − t_n and ΔW_n = W_{t_{n+1}} − W_{t_n}. Substituting (3.1) into (2.4) gives us

$$
\begin{aligned}
u(t_{n+1}, X_{t_{n+1}}) - u(t_n, X_{t_n}) \approx{}& -f\!\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^T(t_n, X_{t_n})\nabla u(t_n, X_{t_n})\right)\Delta t_n \\
& + \left[\nabla u(t_n, X_{t_n})\right]^T \sigma(t_n, X_{t_n})\,\Delta W_n.
\end{aligned}
\tag{3.2}
$$

This is a time-discretized version of (2.4). By using this equation, when given the terminal condition, we can propagate in time provided that ∇u(t_n, X_{t_n}) can be approximated numerically at each time step.

Approximation of ∇u(t_n, X_{t_n}) at each time step is done by approximating the function

$$
x \mapsto \sigma^T(t, x)\nabla u(t, x) \tag{3.3}
$$

at each time step t = t_n by a multilayer feed-forward neural network:

$$
\sigma^T(t_n, X_{t_n})\nabla u(t_n, X_{t_n}) = (\sigma^T\nabla u)(t_n, X_{t_n}) \approx (\sigma^T\nabla u)(t_n, X_{t_n} \mid \theta_n) \tag{3.4}
$$

for n = 1, …, N − 1, where θ_n represents the parameters of the neural network approximating σ^T(t, x)∇u(t, x) at t = t_n. We associate a sub-network with each time step t_n and stack all these sub-networks together to form a deep neural network as a whole. This network takes the paths {X_{t_n}}_{0≤n≤N} and {W_{t_n}}_{0≤n≤N} as the input data and gives the final output, denoted by û({X_{t_n}}_{0≤n≤N}, {W_{t_n}}_{0≤n≤N}), as an approximation to u(t_N, X_{t_N}).

The error, which is the difference between the approximation and the given terminal condition g(x) = u(T, x), defines the loss function as follows:

$$
l(\theta) = \mathbb{E}\left[\left|g(X_{t_N}) - \hat{u}\!\left(\{X_{t_n}\}_{0\le n\le N}, \{W_{t_n}\}_{0\le n\le N}\right)\right|^2\right], \tag{3.5}
$$

where θ is the total set of parameters. We refer to18 for more details on the neural network, where the authors presented the architecture shown in Fig. 1. The network employs N − 1 fully-connected feed-forward sub-networks, each consisting of 4 layers (including 1 output layer) with d + 10 hidden units in each hidden layer. Through extensive experiments, we realized that increasing the number of hidden layers did not improve our numerical accuracy much. Therefore, we continued to employ the same parameters used in the network throughout the paper.

The BSDE solver used in this paper was collected from18. It was modified and implemented using Python version 3.9.12, TensorFlow version 2.9.1, NumPy version 1.21.5, and various other generally used packages. The numerical experiments were performed on a PC with an AMD Ryzen 2700X boost clocked at 4.0 GHz, an NVIDIA GeForce RTX 2060 Super, and 16 GB of RAM at 1367 MHz. The focus of the numerical experiments is on reaction–diffusion equations. The first experiment was done on a 5D test problem, and the second on the Allen–Cahn equation using various parameters and dimensions.
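The sketch below is a minimal illustration of the rollout described above — the discretized BSDE step (3.2), one subnetwork per interior time step (3.4), and the terminal loss (3.5). It is not the authors' implementation: the class and function names (DeepBSDESketch, loss_fn), the layer sizes, and the way paths and coefficients are supplied are assumptions made only for illustration; the drift μ, diffusion σ, nonlinearity f, and terminal condition g must be provided by the user.

```python
import tensorflow as tf

class DeepBSDESketch(tf.keras.Model):
    def __init__(self, d, num_steps, hidden=None):
        super().__init__()
        hidden = hidden or d + 10                       # d + 10 hidden units per layer
        self.N = num_steps
        # trainable initial guesses for u(0, X_0) and sigma^T grad u(0, X_0)
        self.y0 = tf.Variable(tf.random.uniform([1], 0.4, 0.5))
        self.z0 = tf.Variable(tf.random.uniform([1, d], -0.1, 0.1))
        # one feed-forward subnetwork per interior time step, ReLU activations
        self.z_nets = [tf.keras.Sequential([
            tf.keras.layers.Dense(hidden, activation="relu"),
            tf.keras.layers.Dense(hidden, activation="relu"),
            tf.keras.layers.Dense(d)]) for _ in range(num_steps - 1)]

    def call(self, x, dw, f, dt):
        # x: (batch, N+1, d) sample paths from (3.1); dw: (batch, N, d) increments
        y = tf.zeros_like(dw[:, 0, :1]) + self.y0       # (batch, 1), u(t_0, X_0)
        z = tf.zeros_like(dw[:, 0, :]) + self.z0        # (batch, d), sigma^T grad u
        for n in range(self.N):
            t_n = n * dt
            # discretized BSDE step (3.2); f must return shape (batch, 1)
            y = y - f(t_n, x[:, n, :], y, z) * dt \
                  + tf.reduce_sum(z * dw[:, n, :], axis=1, keepdims=True)
            if n < self.N - 1:
                z = self.z_nets[n](x[:, n + 1, :])      # approximation (3.4)
        return y                                        # approximation of u(T, X_T)

def loss_fn(model, x, dw, f, g, dt):
    y_terminal = model(x, dw, f, dt)
    return tf.reduce_mean(tf.square(g(x[:, -1, :]) - y_terminal))   # loss (3.5)
```

In a sketch like this, all parameters (y0, z0, and the subnetwork weights) would be trained jointly by minimizing loss_fn with a stochastic-gradient optimizer over freshly sampled paths.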
4. 5D test problem

We investigate the following reaction–diffusion equation:

$$
\begin{cases}
\dfrac{\partial u(t,x)}{\partial t} = \Delta u(t,x) - 0.2u - 5e^{-0.2t}, & t > 0,\\[4pt]
u(0, x) = (x_1^2 + \cdots + x_5^2)/2 = \|x\|^2/2,
\end{cases}
\tag{4.1}
$$

for x ∈ R^5 and t ∈ [0, T]. The analytical solution to (4.1) is given by

$$
u(t, x) = \frac{1}{2}\left(x_1^2 + \cdots + x_5^2\right) e^{-0.2t}. \tag{4.2}
$$

In order to convert (4.1) into the form of (1.1), we consider a time reversal mapping t ↦ T − t for T > 0. This leads us to the following equation with a terminal condition:

$$
\begin{cases}
\dfrac{\partial u}{\partial t} + \Delta u + 0.2u + 5e^{-0.2(T-t)} = 0,\\[4pt]
u(T, x) = (x_1^2 + \cdots + x_5^2)/2 = \|x\|^2/2.
\end{cases}
\tag{4.3}
$$

This matches the semilinear parabolic form of (1.1) with σ = √2 I_5, μ(t, x) = 0, and f(t, u) = 0.2u + 5e^{−0.2(T−t)}, where I_5 denotes the 5 × 5 identity matrix.
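As a quick sanity check (not taken from the paper), the claimed analytical solution (4.2) can be verified symbolically against (4.1); availability of sympy is assumed:

```python
import sympy as sp

t = sp.symbols('t')
xs = sp.symbols('x1:6')                                   # x1, ..., x5
u = sp.Rational(1, 2) * sum(x**2 for x in xs) * sp.exp(-sp.Rational(1, 5) * t)

# residual of u_t = Laplacian(u) - 0.2 u - 5 exp(-0.2 t), i.e. Eq. (4.1)
residual = sp.diff(u, t) - (sum(sp.diff(u, x, 2) for x in xs)
                            - sp.Rational(1, 5) * u
                            - 5 * sp.exp(-sp.Rational(1, 5) * t))
print(sp.simplify(residual))   # prints 0, so (4.2) satisfies (4.1)
```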


Fig. 1. Rough sketch of the architecture of the deep BSDE solver17 .

Fig. 2. Left: Approximated solution 𝑢(𝑇 = 0.4, 𝑥 = (0.8, 0.5, 1.0, 0.0, 0.2)) to (4.3) using BSDE solver for different iterations of the network. Right: Relative errors of the approximation
from the BSDE solver at different iterations of the network.

Table 1 shows the network parameters used unless stated otherwise.

Table 1
Network parameters used to produce Fig. 2 and Fig. 3.
Initial starting range: [0.5, 0.6]
Number of time intervals: 30
Learning rates: 5 × 10⁻², 4 × 10⁻³
Learning rate boundary: 500
Number of iterations: 1000

The initial starting range shows where the initial guess of the solution lies. The number of time intervals is the number of steps used in the explicit Euler method, which is also the number of layers in the network. The learning rates represent our choices of what rate to use in correspondence with the learning rate boundaries; i.e., 5 × 10⁻² is the learning rate for the first 500 iterations, and a smaller learning rate, 4 × 10⁻³, is used for the remaining iterations.

It is possible to approximate solutions to (4.1) at any point. In the following, we present an approximated solution of (4.1) using the BSDE solver and its performance at the point (T = 0.4, x = (0.8, 0.5, 1.0, 0.0, 0.2)). The analytical solution to (4.1) at this point is 0.965e^{−0.08}, obtained using (4.2).

The relative error used in the plots throughout the paper is defined as

$$
\text{Relative Error} = \frac{|\text{Analytical Solution} - \text{BSDE Solution}|}{\text{Analytical Solution}}. \tag{4.4}
$$
modify the learning rates.
Fig. 2 shows the approximated solution on the left and the relative error on the right, computed by the BSDE solver at the point (T = 0.4, x = (0.8, 0.5, 1.0, 0.0, 0.2)) for (4.1). The shaded region depicts the mean ± one standard deviation over five independent runs. The figures show that the algorithm converges to an accurate solution (less than 0.1% relative error) after around 500 iterations, although five iterations already give less than 10% relative error.

The time evolution of (4.1), using the same x-point as in Fig. 2, is presented below. We use the following recursive relationship for the time values: T_0 = 0, ΔT = 0.1, T_n = T_0 + nΔT, n = 1, …, 6. Each T_n is run for 5 independent runs, and the average (computed by the solver) over those runs is plotted. For each T_n we plot the relative error to the right of the approximated solutions. Note that the relative error at T = 0.0 is 0 and is excluded from the plot.

The results show that the BSDE solver can perform extremely well, especially using only 30 time steps. This is one of the advantages of the BSDE solver: the accuracy is not restricted to small time steps in the explicit Euler method, as was a limitation for traditional numerical techniques.

4.1. 5D test problem: Adaptive deep neural network

When using this solver, the quickest way to achieve convergence is to initialize the starting point within a range in which you believe the solution lies. The issue with this is that you may not always know where the solution is. We consider (4.1) again and focus on the point (T = 0.2, x = (0.8, 0.5, 1.0, 0.0, 0.2)), as seen in Fig. 3. The analytical solution was found to be 0.927162 at this point. Next, we compare the approximation from the solver with the analytical solution to see if the BSDE solver is able to find a good approximation even with a very distant initialization. In a situation like this, it is a good idea to heavily modify the learning rates.

We explore the solution given by these parameters for a single run. For case one, our network is initialized at 817.345 as its starting solution. Plots of the first 400 iterations, the last 1000 iterations, and the relative error over all iterations for a single run can be found in Fig. 4. Case 2 represents a slightly closer initialization, at 9.706. We use the network parameters found in Table 2 to achieve the solution. Fig. 5 presents the plots of the BSDE solution for the first 200 iterations, the last 500 iterations, and the relative error over all iterations.

It can be seen from Figs. 4 and 5 that using both initializations the network converges to the same approximate solution.


Fig. 3. Analytical and approximated solution of the test problem (4.1) at x = (0.8, 0.5, 1.0, 0.0, 0.2) for the time evolution, alongside the corresponding relative error.

Table 2
Network parameters used to solve the test problem (4.1) at distant initial guesses.
Case 1: Initial starting range [800, 900]; Learning rates 30, 3, 0.5, 0.3, 0.05, 0.01, 0.006, 0.003; Learning rate boundaries 250, 500, 600, 1250, 1500, 1700, 1850, 6000.
Case 2: Initial starting range [9.5, 10]; Learning rates 5, 0.5, 0.06, 0.004, 0.0004; Learning rate boundaries 50, 150, 300, 1200, 1500.
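One way to realize such stepwise learning rates in TensorFlow is a piecewise-constant schedule keyed on the iteration count. This is a sketch of the idea rather than the authors' configuration code, shown with the Table 1 values; the longer lists in Table 2 follow the same pattern, and pairing the schedule with the Adam optimizer is an assumption:

```python
import tensorflow as tf

# Piecewise-constant learning rate: one value per segment, switching at the
# boundary iteration counts (values taken from Table 1).
lr_schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[500],           # learning rate boundary
    values=[5e-2, 4e-3])        # rate for the first 500 iterations, then after
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```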

Fig. 4. Left: first 400 iterations with an initialization at 817.345. Middle: the last 1000 iterations with an initialization at 817.345. Right: the relative error over all iterations with an initialization at 817.345.

Fig. 5. Left: first 200 iterations with an initialization at 9.706. Middle: the last 500 iterations with an initialization at 9.706. Right: the relative error over all iterations with an initialization at 9.706.

With an initialization of 817.345, the network begins to make an accurate approximation around the 175th iteration. When the initialization is slightly closer to the analytical solution, the network makes an accurate approximation around the 75th iteration. Both plots show oscillatory behaviors of the network's solution during these single runs.

4.2. 5D test problem: Hyper-parameter comparison

As seen in the previous subsection, we decided to change the initialization zone, the number of iterations, and the learning rates to see if we could still retrieve a good approximation of the analytical solution. Another major network parameter that we explore in this subsection is the activation function.

We reconsider the time evolution of (4.1) for x = (0.8, 0.5, 1.0, 0.0, 0.2). In all previous examples, the ReLU activation function was used to compute results. We present the following activation functions to be used for experimentation in this subsection, with input z:

• Sigmoid takes a real value as input and outputs another value between 0 and 1:

$$
\sigma(z) = \frac{1}{1 + e^{-z}} \tag{4.5}
$$


• The rectified linear activation function, or ReLU for short, is a piecewise linear function that outputs the input directly if it is positive and otherwise outputs zero:

$$
\mathrm{ReLU}(z) = \max(0, z) = \begin{cases} z, & z \ge 0\\ 0, & z < 0 \end{cases} \tag{4.6}
$$

• The hyperbolic tangent function tanh(z).

• The exponential linear unit, or ELU, is a non-saturating activation function that tends to converge to a solution quicker than ReLU:

$$
\mathrm{ELU}(z) = \begin{cases} z, & z \ge 0\\ \alpha(e^{z} - 1), & z < 0 \end{cases} \tag{4.7}
$$

• Softplus is a smooth approximation to the ReLU function and can be used to constrain the output of a machine to always be positive:

$$
\mathrm{Softplus}(z) = \ln(e^{z} + 1) \tag{4.8}
$$
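For reference, the activation functions (4.5)–(4.8) can be written directly in NumPy. These are plain re-implementations for illustration, not the solver's internal TensorFlow layers:

```python
import numpy as np

def sigmoid(z):                    # (4.5)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):                       # (4.6)
    return np.maximum(0.0, z)

def elu(z, alpha=1.0):             # (4.7), alpha = 1 as in the experiments
    return np.where(z >= 0, z, alpha * (np.exp(z) - 1.0))

def softplus(z):                   # (4.8)
    return np.log(np.exp(z) + 1.0)

# tanh is available directly as np.tanh
z = np.linspace(-3.0, 3.0, 7)
for fn in (sigmoid, relu, np.tanh, elu, softplus):
    print(fn.__name__, fn(z))
```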
Fig. 6. Activation functions tested in the BSDE solver for the 5D test problem.

Fig. 6 shows the profiles of all five activation functions. These functions are plotted on a grid to better represent the linear, exponential, and asymptotic behaviors that the functions possess. The numerical approximations by different activation functions and the analytical solution are shown on the left of Fig. 7. Relative errors are plotted as a function of time on the right of Fig. 7. Here, the relative errors at T = 0 are 0 for all activation functions, and we exclude them from the plot. Note that α = 1 in the exponential linear unit activation function (4.7).

It can be seen that functions like Softplus and Sigmoid perform better for T = 0.1 and T = 0.3; however, generally, for all other values of T the relative error is larger than what is seen for the ReLU function. This is strictly the case for T = 0.6. Other functions perform similarly to ReLU but are unable to stay consistent throughout all values of T. The ReLU function gives the most consistent and accurate solutions. The hyperbolic tangent function provides the closest representation to the ReLU error except for T = 0.2 and T = 0.6. There is not one single function that seems to perform the worst; all have their points of better and worse approximations throughout T. We refrain from calling any of the activation functions unacceptable, as the largest relative error found was 2.3%.

5. The Allen–Cahn equation

The Allen–Cahn equation is used to describe the phase separation or phase transition of multi-component alloy systems and crystalline solids; it was first introduced by Samuel Allen and John Cahn in 197934. The applications of the Allen–Cahn equation have since spread much wider, to image segmentation35,36 and to finding the mean curvature of multi-dimensional surfaces37,38, along with many other uses in fluid dynamics and materials science39,40.

In this section, we investigate the Allen–Cahn equation of the following form:

$$
\begin{cases}
\dfrac{\partial u(t,x)}{\partial t} - \Delta u - \dfrac{1}{\epsilon^{2}} f'(u) = 0, & (x, t) \in \Omega \times (0, T],\\[4pt]
\dfrac{\partial u(t,x)}{\partial n} = 0, & t \in (0, T],\\[4pt]
u(0, x) = g(x), & x \in \Omega,
\end{cases}
\tag{5.1}
$$

where Ω ⊂ R^d is a bounded domain. Note that the solution u(t, x) represents the concentration of one of the two metallic components that make up the alloy, ε > 0 is known as the interaction length or interfacial width38,39, and f(u) is a nonlinear function representing a polynomial double-well potential. The double-well function used can be computed by the following equation:

$$
f_{\alpha}(u) = \frac{1}{4}\left(u^{\alpha} - 1\right)^{2}. \tag{5.2}
$$

The double-well potential is considered because the nature of phase separation requires at least two distinct phases to occur from the original source. A double-well potential gives us two functional minima, as shown in Fig. 8. The network parameters used can be found in Table 3.

Table 3
Network parameters used in the BSDE solver for the Allen–Cahn equations in 1D, 3D and 60D.
Initial starting range: [0.4, 0.5]
Number of time intervals: 30
Learning rates: 4 × 10⁻², 4 × 10⁻⁴
Learning rate boundary: 1000
Number of iterations: 2000

The focus of these numerical experiments is to examine the performance of the BSDE solver for the Allen–Cahn equation in the following scenarios:

1. Higher-order potential functions: Recently, the AC equation with a high-order polynomial was introduced to better represent the interfacial dynamics41,42. This is referred to as the hAC equation. For the higher-order f_α(u) we consider the following: α is an even integer, and we experiment with α = 2, 4, 6, and 8, as seen in42. The potential functions' profiles can be found in Fig. 8, where α = 2 is the classical double-well potential function often used in the literature. To the best of our knowledge, there exists little research on numerical simulations of the Allen–Cahn equation with higher-order potential functions. Substituting (5.2) into (5.1), and using f'_α(u) = (α/2)(u^{2α−1} − u^{α−1}), we obtain the following equation (a short code sketch of this potential and the resulting reaction term is given below, after (5.4)):

$$
\frac{\partial u(t,x)}{\partial t} - \Delta u - \frac{\alpha}{2\epsilon^{2}}\left(u^{2\alpha-1} - u^{\alpha-1}\right) = 0. \tag{5.3}
$$

2. Different parameter ε: We consider ε = 1, 1/3, 1/6, and 1/9, when the approximation is acceptable. As shown in Fig. 9, as ε gets smaller, the numerical approximations tend to become worse.

3. Different dimensions in space: We present solutions in higher dimensions, namely d = 1, 3, and 60.

Applying the time reversal mapping seen in the previous section in (4.3) and the explicit form of our potential function in (5.3), we obtain the terminal condition problem to be used in our solver as follows:

$$
\begin{cases}
\dfrac{\partial u(t,x)}{\partial t} + \Delta u + \dfrac{\alpha}{2\epsilon^{2}}\left(u^{\alpha-1} - u^{2\alpha-1}\right) = 0, & (x, t) \in \Omega \times (0, T],\\[4pt]
\dfrac{\partial u(t,x)}{\partial n} = 0, & t \in (0, T],\\[4pt]
u(T, x) = g(x), & x \in \Omega.
\end{cases}
\tag{5.4}
$$
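The following short sketch collects the potential (5.2), its derivative obtained by differentiating (5.2), and the reaction term as it appears in the terminal-value problem (5.4). The function names are illustrative, not taken from the solver code:

```python
import numpy as np

def f_alpha(u, alpha):
    """Higher-order double-well potential (5.2)."""
    return 0.25 * (u**alpha - 1.0)**2

def df_alpha(u, alpha):
    """f'_alpha(u) = (alpha/2) * (u^(2*alpha-1) - u^(alpha-1)), the term used in (5.3)."""
    return 0.5 * alpha * (u**(2 * alpha - 1) - u**(alpha - 1))

def reaction(u, alpha, eps):
    """Nonlinear reaction term of the terminal-value problem (5.4)."""
    return 0.5 * alpha / eps**2 * (u**(alpha - 1) - u**(2 * alpha - 1))

u = np.linspace(-1.5, 1.5, 7)
for a in (2, 4, 6, 8):
    print(a, f_alpha(u, a).round(3))
```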

5.1. Allen–Cahn: 1D, α = 2, varying ε

We begin by exploring the solutions for multiple ε values, for α = 2 and x ∈ R.


Fig. 7. Left: Time evolution of the analytical solution and approximations using different activation functions for the 5D test problem (4.1) at 𝑥 = (0.8, 0.5, 1.0, 0.0, 0.2). Right:
Corresponding relative errors obtained by using different activation functions with respect to time.

Fig. 8. Higher-order polynomial double-well potential functions (5.2) with α = 2, 4, 6, and 8.

This is the only case where the analytical solution of the Allen–Cahn equation can be explicitly found; the analytical solution to the initial condition problem is given in13 as follows:

$$
u(t, x) = \frac{1}{2}\left(1 - \tanh\!\left(\frac{x - 0.5 - st}{2\sqrt{2}\,\epsilon}\right)\right), \quad t > 0, \tag{5.5}
$$

where s = 3/(√2 ε). The initial condition is given by plugging t = 0 into the analytical solution, u(0, x) = ½(1 − tanh((x − 0.5)/(2√2 ε))).
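A direct evaluation of (5.5) gives the reference profiles used for comparison; the small sketch below uses illustrative variable names and the same ε values and time as the 1D experiments:

```python
import numpy as np

def u_exact(t, x, eps):
    """Traveling-wave solution (5.5) of the 1D Allen-Cahn equation (alpha = 2)."""
    s = 3.0 / (np.sqrt(2.0) * eps)
    return 0.5 * (1.0 - np.tanh((x - 0.5 - s * t) / (2.0 * np.sqrt(2.0) * eps)))

x = np.linspace(0.0, 4.0, 401)
for eps in (1.0, 1/3, 1/6, 1/9):
    u0 = u_exact(0.0, x, eps)      # initial condition u(0, x)
    u  = u_exact(0.01, x, eps)     # reference profile at t = 0.01, as in Fig. 9
    print(eps, float(u[0]), float(u[-1]))
```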
We present numerical experiments for the 1-dimensional problem for x ∈ [0, 4] when ε = 1, 1/3, 1/6, 1/9 and t = 0.01, as in13. The analytical solution, the approximated solution, and the relative errors are shown in Fig. 9 on the left, middle, and right, respectively. It can be seen that the approximation gets better when ε is larger. Furthermore, the BSDE solver can handle ε as small as around 1/9 when α = 2. The time evolution of the approximated solutions u(t, x = 0) for t ∈ [0, 0.16] is presented on the left of Fig. 10 for different values of α. As we are not aware of other analytical solutions for varying α, we use the network's loss values as a replacement for the relative error. The right of Fig. 10 shows the loss profiles.

5.2. Allen–Cahn: 3D and 60D, varying α and ε

Next, we experiment with the higher-dimensional Allen–Cahn equation (5.1), for x ∈ R^3 and x ∈ R^60. We approximate u(T = 0.16, 0) using two sets of parameters on the AC equations. Approximated solutions and loss profiles over [0, 0.16] are illustrated. Fig. 11 shows numerical results using the first set of parameters, ε = 1 and α = 2, 4, and Fig. 12 corresponds to the second set of parameters, α = 2 and ε = 1, 1/3.

It can be seen from both Figs. 11 and 12 that, with the given parameters, increasing the dimension of the problem yields similar behaviors and shapes of the solution. Looking at the potential function profiles in Fig. 8, for u = 1 or in the surrounding region, f_α(u) approaches zero and becomes a local minimum (mentioned in Section 5). Examining Fig. 12, we see that when ε = 1/3 the solution increases much more quickly than when ε = 1. We believe that for larger time intervals the solution will flatten out over time and converge to a single value. We can also see that for larger ε the solution tends to increase much more slowly (than when using ε < 1).

It should be noted that there was an issue when using the parameters in Table 3 for the point t = 0.16, x ∈ R^60 in Fig. 12. Attempting to run this point in the solver raised an issue where the network's loss returned ‘‘NaN’’ (not a number). When examining the iterations before this occurred, it seemed as if the loss was strangely growing and moving towards infinity. We solved this issue by changing the network parameters, only for this single point. The number of time intervals was increased from 30 to 300, and a constant learning rate of 0.04 was used to run the simulation for 100 iterations. This fixed the issue and gave us a solution that follows the correct overall behavior. The code for these numerical experiments can be obtained upon request.

6. Conclusion

In this paper, we focus on 1D, 3D, 5D, and 60D numerical simulations of reaction–diffusion equations:

a. 5D test problem: We started by creating a 5D test problem with a known analytical solution to test the efficiency and accuracy of the deep neural network. We analyzed the time evolution of our 5D reaction–diffusion equation at the origin. The relative errors were about 0.1%, and in some cases reached as small as 0.01%. This behavior of outstanding relative errors echoes throughout the remaining experiments.

b. The neural network: The network typically performs best when the initial guess of the solution is chosen in a range that is believed to contain the analytical solution. However, through major hyper-parameter changes, such as the number of iterations and the learning rates, we were able to show that the network converges towards the analytical solution even when initialized far from the true solution. Throughout the iterations of the network, the relative error generally stayed bounded between 1% and 0.01%, further showing the capacity of the deep neural network. To keep with the theme of modifying hyper-parameters, we also altered the activation function used. The solver originally uses the well-known ReLU function; we then tried other activation functions such as the sigmoid, the hyperbolic tangent, ELU, and the softplus functions.


Fig. 9. AC 1D for ε = 1, 1/3, 1/6, 1/9 when t = 0.01. Left: analytical solution. Middle: BSDE approximation. Right: relative error.

Fig. 10. Left: Time evolution of (5.4) with ε = 1 and for α = 2, 4, 6, and 8. Right: Corresponding loss values.

Fig. 11. Top left: time evolution of 3D AC with ε = 1 and α = 2, 4. Top right: calculated loss by the network for the time evolution of 3D AC with ε = 1 and α = 2, 4. Bottom left: time evolution of 60D AC with ε = 1 and α = 2, 4. Bottom right: calculated loss by the network for the time evolution of 60D AC with ε = 1 and α = 2, 4.


Fig. 12. Top left: time evolution of 3D AC with α = 2 and ε = 1, 1/3. Top right: calculated loss by the network for the time evolution of 3D AC with α = 2 and ε = 1, 1/3. Bottom left: time evolution of 60D AC with α = 2 and ε = 1, 1/3. Bottom right: calculated loss by the network for the time evolution of 60D AC with α = 2 and ε = 1, 1/3.

In most cases, the ReLU performed the best and most consistently, apart from a few exceptions where the hyperbolic tangent, softplus, or sigmoid function outperformed it at certain points in time. We also studied another reaction–diffusion equation, the Allen–Cahn (AC) equation, in both low and high-dimensional spaces.

c. The Allen–Cahn equation: The Allen–Cahn equation contains two major parameters on which we decided to perform analysis: ε, the interaction length, and α, a potential function parameter. Our first analysis was done in 1D, finding the spatial solution of the AC equation with α = 2 and varying ε at a final time of t = 0.01, for which there is a known analytical solution. The relative error for these approximations was found to be outstanding, ranging between 1% and 0.00001%. Furthermore, we observed that the smaller ε was, the larger the error incurred. When working with the 3D and 60D Allen–Cahn equations under two different situations, we found the loss values of the approximations to commonly be between 10⁻² and 10⁻⁴, in some cases reaching as small as 10⁻⁶.

Through extensive experiments and comparisons of our approximations with analytical solutions to the 1D Allen–Cahn (AC) equation, we found that the algorithm is extremely accurate, even with a small interaction length parameter ε. In higher-dimensional spaces, we focused on the time evolution of the solution at the origin using various ε and orders of the potential function α. The solution to the AC equation as a function of time at the origin changes slightly as the dimension increases. The algorithm was able to approximate the solution with an interaction length of ε = 1/9 for the final time T = 0.16 even in 60D. As the order of the potential function increases, the solution increases at a slower rate.

We have experienced a limitation of the solver when working with the Allen–Cahn equation. When using ε < 1/9, the network's loss and solution become very inconsistent, erratic, and unstable. The ability of the solver to converge quickly to a fair approximation of the solution when the initial guess is close to the analytical solution is very impressive. One downside of the solver in its current state is that it can only find the solution at a specific point in time and space, rather than providing solutions over a region D ⊂ R. Ref.21 provided a solution for such problems over a region; however, it is time-consuming, although still reasonable for realistic high-dimensional problems.

A major benefit of this solver and the method of finding local solutions to second-order semilinear parabolic PDEs is that the computational speed is not affected by the size of the spatial dimension. From what we have experimented with, the speed and accuracy of the network are not solely determined by the dimension. The network is flexible, with the ability to accept various types of semilinear parabolic PDEs to solve.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.


References

1. Berchtold S, Böhm C, Kriegal HP. The pyramid-technique: Towards breaking the curse of dimensionality. In: Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. 1998:142–153.
2. Berisha V, Krantsevich C, Hahn PR, et al. Digital medicine and the curse of dimensionality. NPJ Dig Med. 2021;4(1):1–8.
3. Charles V, Aparicio J, Zhu J. The curse of dimensionality of decision-making units: A simple approach to increase the discriminatory power of data envelopment analysis. European J Oper Res. 2019;279(3):929–940.
4. Chen W, Wang S. A 2nd-order ADI finite difference method for a 2D fractional Black–Scholes equation governing European two asset option pricing. Math Comput Simulation. 2020;171:279–293.
5. Li J, Ju L, Cai Y, Feng X. Unconditionally maximum bound principle preserving linear schemes for the conservative Allen–Cahn equation with nonlocal constraint. J Sci Comput. 2021;87(3):1–32.
6. Hutzenthaler M, Nguyen TA. Multilevel Picard approximations of high-dimensional semilinear partial differential equations with locally monotone coefficient functions. Appl Numer Math. 2022.
7. Becker R, Brunner M, Innerberger M, Melenk JM, Praetorius D. Rate-optimal goal-oriented adaptive FEM for semilinear elliptic PDEs. Comput Math Appl. 2022;118:18–35.
8. Zhang H, Yan J, Qian X, Song S. Numerical analysis and applications of explicit high order maximum principle preserving integrating factor Runge–Kutta schemes for Allen–Cahn equation. Appl Numer Math. 2021;161:372–390.
9. Yang X. A novel fully-decoupled, second-order and energy stable numerical scheme of the conserved Allen–Cahn type flow-coupled binary surfactant model. Comput Methods Appl Mech Engrg. 2021;373:113502.
10. Yang X. A novel decoupled second-order time marching scheme for the two-phase incompressible Navier–Stokes/Darcy coupled nonlocal Allen–Cahn model. Comput Methods Appl Mech Engrg. 2021;377:113597.
11. Deteix J, Kouamo GN, Yakoubi D. A new energy stable fractional time stepping scheme for the Navier–Stokes/Allen–Cahn diffuse interface model. Comput Methods Appl Mech Engrg. 2022;393:114759.
12. Bartels A, Mosler J. Efficient variational constitutive updates for Allen–Cahn-type phase field theory coupled to continuum mechanics. Comput Methods Appl Mech Engrg. 2017;317:55–83.
13. Niu J, Xu M, Yao G. An efficient reproducing kernel method for solving the Allen–Cahn equation. Appl Math Lett. 2019;89:78–84.
14. Tatari M, Dehghan M. On the solution of the non-local parabolic partial differential equations via radial basis functions. Appl Math Model. 2009;33(3):1729–1738.
15. Dehghan M, Tatari M. Use of radial basis functions for solving the second-order parabolic equation with nonlocal boundary conditions. Numer Methods Partial Differ Equ. 2008;24(3):924–938.
16. Postavaru O, Toma A. Numerical solution of two-dimensional fractional-order partial differential equations using hybrid functions. Partial Differ Equ Appl Math. 2021;4:100099.
17. Han J, Jentzen A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun Math Stat. 2017;5(4):349–380.
18. Han J, Jentzen A, E W. Solving high-dimensional partial differential equations using deep learning. Proc Natl Acad Sci. 2018;115(34):8505–8510.
19. E W, Han J, Jentzen A. Algorithms for solving high dimensional PDEs: from nonlinear Monte Carlo to machine learning. Nonlinearity. 2021;35(1):278.
20. Güler B, Laignelet A, Parpas P. Towards robust and stable deep learning algorithms for forward backward stochastic differential equations. 2019. arXiv preprint arXiv:1910.11623.
21. Han J, Lu J, Zhou M. Solving high-dimensional eigenvalue problems using deep neural networks: A diffusion Monte Carlo like approach. J Comput Phys. 2020;423:109792.
22. Saneifard R, Jafarian A, Ghalami N, Nia SM. Extended artificial neural networks approach for solving two-dimensional fractional-order Volterra-type integro-differential equations. Inform Sci. 2022;612:887–897.
23. Jafarian A, Rezaei R, Golmankhaneh A. On solving fractional higher-order equations via artificial neural networks. Iran J Sci Technol Trans A Sci. 2022;46(2):535–545.
24. Jafarian A, Nia SM, Golmankhaneh AK, Baleanu D. On artificial neural networks approach with new cost functions. Appl Math Comput. 2018;339:546–555.
25. Grohs P, Hornung F, Jentzen A, Von Wurstemberger P. A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black–Scholes partial differential equations. 2018. arXiv preprint arXiv:1809.02362.
26. Kissas G, Yang Y, Hwuang E, Witschey WR, Detre JA, Perdikaris P. Machine learning in cardiovascular flows modeling: Predicting arterial blood pressure from non-invasive 4D flow MRI data using physics-informed neural networks. Comput Methods Appl Mech Engrg. 2020;358:112623.
27. Liu M, Liang L, Sun W. A generic physics-informed neural network-based constitutive model for soft biological tissues. Comput Methods Appl Mech Engrg. 2020;372:113402.
28. Beck C, Gonon L, Jentzen A. Overcoming the curse of dimensionality in the numerical approximation of high-dimensional semilinear elliptic partial differential equations. 2020. arXiv preprint arXiv:2003.00596.
29. Reisinger C, Zhang Y. Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems. Anal Appl. 2020;18(06):951–999.
30. Mattey R, Ghosh S. A novel sequential method to train physics informed neural networks for Allen Cahn and Cahn Hilliard equations. Comput Methods Appl Mech Engrg. 2022;390:114474.
31. Davis E, Yao G, Javor E, Rubasinghe K, Galván LAT. A test of backward stochastic differential equations solver for solving semilinear parabolic differential equations in 1D and 2D. Partial Differ Equ Appl Math. 2022;6:100457.
32. Evans L. An Introduction to Stochastic Differential Equations, Vol. 82. American Mathematical Society; 2012.
33. Oksendal B. Stochastic Differential Equations: An Introduction with Applications. Springer Science & Business Media; 2013.
34. Allen SM, Cahn JW. A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metall. 1979;27(6):1085–1095.
35. Beneš M, Chalupeckỳ V, Mikula K. Geometrical image segmentation by the Allen–Cahn equation. Appl Numer Math. 2004;51(2–3):187–205.
36. Lee D, Lee S. Image segmentation based on modified fractional Allen–Cahn equation. Math Probl Eng. 2019;2019.
37. Choi Y, Jeong D, Lee S, Yoo M, Kim J. Motion by mean curvature of curves on surfaces using the Allen–Cahn equation. Internat J Engrg Sci. 2015;97:126–132.
38. Feng X, Prohl A. Numerical analysis of the Allen–Cahn equation and approximation for mean curvature flows. Numer Math. 2003;94(1):33–65.
39. Shen J, Yang X. Numerical approximations of Allen–Cahn and Cahn–Hilliard equations. Discrete Contin Dyn Syst. 2010;28(4):1669.
40. Aihara S, Takaki T, Takada N. Multi-phase-field modeling using a conservative Allen–Cahn equation for multiphase flow. Comput & Fluids. 2019;178:141–151.
41. Fusco G. Periodic motions for multi-wells potentials and layers dynamic for the vector Allen–Cahn equation. J Dynam Differential Equations. 2021:1–51.
42. Lee C, Kim H, Yoon S, et al. An unconditionally stable scheme for the Allen–Cahn equation with high-order polynomial free energy. Commun Nonlinear Sci Numer Simul. 2021;95:105658.
