Representing Local Atomic Environment Using Descriptors Based On Local Correlations
A. Samanta
September 7, 2018
This document was prepared as an account of work sponsored by an agency of the United States
government. Neither the United States government nor Lawrence Livermore National Security, LLC,
nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or
responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or
process disclosed, or represents that its use would not infringe privately owned rights. Reference herein
to any specific commercial product, process, or service by trade name, trademark, manufacturer, or
otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the
United States government or Lawrence Livermore National Security, LLC. The views and opinions of
authors expressed herein do not necessarily state or reflect those of the United States government or
Lawrence Livermore National Security, LLC, and shall not be used for advertising or product
endorsement purposes.
Representing local atomic environment using descriptors based on local correlations
Amit Samanta∗
Physics Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
(Dated: December 21, 2018)
Statistical learning of material properties is an emerging topic of research and has been tremendously successful in areas such as representing complex energy landscapes as well as in technologically relevant areas, like identification of better catalysts and electronic materials. However, analysis of large data sets to efficiently learn characteristic features of a complex energy landscape, for example, depends on the ability of descriptors to effectively screen different local atomic environments. Thus, discovering appropriate descriptors of bulk or defect properties and the functional dependence of such properties on these descriptors remains a difficult and tedious process. To this end, we develop a framework to generate descriptors based on many-body correlations that can effectively capture intrinsic geometric features of the local environment of an atom. These descriptors are based on the spectrum of two-body, three-body, four-body and higher order correlations between an atom and its neighbors, and are evaluated by calculating the corresponding two-body, three-body and four-body overlap integrals. They are invariant to global translation, global rotation, reflection and permutations of atomic indices. By systematically testing the ability to capture the local atomic environment, it is shown that the local correlation descriptors are able to successfully reconstruct structures containing 10-25 atoms, which was previously not possible.
polynomial approximation with pairwise and angular dependent descriptors can be used to generate MLP with improved transferability and accuracy. The deep potential molecular dynamics method is another neural network based potential, but it uses descriptors based on the Coulomb matrix and its variants.33

Bartok et al. in Ref. [9] tested these descriptors by probing their ability to reconstruct small clusters. Using many descriptors (i.e. three and four-dimensional power spectrum, bispectrum, Parrinello-Behler type descriptors and Chebyshev polynomials), the authors were able to successfully reconstruct clusters containing 4 to 12 Si atoms to a varying level of accuracy. But, for reasons that are not clear, none of these descriptors was suitable for reconstruction of clusters containing more than 12 atoms. A majority of these descriptors specifically include two-body and three-body correlations, but their inability to reconstruct larger clusters perhaps suggests that a systematic analysis of the importance of many-body correlations to describe local environments is needed. The computational cost of a machine learned potential is determined to a large extent by factors like how many descriptors are required to describe the local environment and the number of operations required to calculate each descriptor. Thus, the development of descriptors that can accurately capture local environments, so that they can be used to reconstruct small clusters (containing fewer than 10-12 atoms) or large clusters (containing more than 10-12 atoms), is important to improve the accuracy and transferability, and decrease the computational cost, of machine learned potentials.

We propose a framework to generate descriptors based on many-body correlations and illustrate that they can effectively capture intrinsic geometric features of the local environment of an atom. These descriptors are based on the spectra of various two-body, three-body, four-body and higher order correlations between an atom and its surroundings. Furthermore, we have explored the relationship between the spectra of these correlation matrices and the spectrum of the adjacency matrix of a regular connected graph. Using these descriptors we were able to successfully reconstruct clusters with sizes ranging from 5 to 25 atoms, which was not possible using existing descriptors. This suggests that descriptors that incorporate many-body correlations are important to properly embed neighborhood information in MLP.

The remainder of the paper takes the following form. The importance of local correlations is discussed in Section II, which is followed by a detailed discussion of the procedure used to obtain two-body, three-body, four-body and five-body correlation descriptors in Sections II A-II D. In Section III, the descriptors developed in Section II are tested by performing reconstruction simulations of clusters containing 5 to 25 atoms. This is followed by discussions in Section IV and a summary in Section V.

II. DESCRIPTORS BASED ON LOCAL DENSITY CORRELATIONS

To design descriptors that embed local correlations, we first consider how the effective atomic density of an atom is affected by the presence of other atoms. To this end, let us consider a structure $S_N$ at a point $X \in \mathbb{R}^{3N}$ in the configuration space containing $N$ atoms, located at $\{\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \cdots, \mathbf{r}_N\}$, such that $\mathbf{r}_i \in \mathbb{R}^3$ for $i = 1, 2, \ldots, N$. The atomic density of such a system is typically written as

\[ \rho(\mathbf{r}) = \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{r}_i), \tag{1} \]

which is a superposition of the densities of all the $N$ atoms. Here, the effective density of each atom is approximated by a Dirac δ-function. It is important to note that the effective atomic density (which is closely related to the accessible free volume) is affected by the local symmetry, local packing density and the spatial separation between neighbors. Hence, a simple superposition of spatially separated Dirac δ-functions, as shown in Eq. 1, completely ignores the correlation between an atom and its immediate neighbors. A simple solution is to replace the δ-functions in Eq. 1 by effective densities that can be calculated by starting with a guess density and by systematically accounting for the overlap between an atom and its neighbors.

To this end, we assume the density of an atom, with index $i$, when it is isolated from any other atoms is represented by a smooth and differentiable function $\rho_1(\mathbf{r}, \mathbf{r}_i)$, such that $\int_{-\infty}^{\infty} \rho_1(\mathbf{r}, \mathbf{r}_i)\, d\mathbf{r} = 1$. We represent the effective density of this atom, when placed amongst a distribution of other atoms (placed at $\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_{N_i}$), by

\[ \bar{\rho}(\mathbf{r}, \mathbf{r}_i \,|\, \mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_{N_i}) = p_1 - p_2 + p_3 - p_4 + \cdots, \tag{2} \]

and this effective density can be used to replace the Dirac δ-functions in Eq. 1. Here, $p_1 = \rho_1(\mathbf{r}, \mathbf{r}_i)$ is the probability of placing the atom at $\mathbf{r}_i$, $N_i$ is the number of neighbors of $i$ and $p_k$ is the overlap between the densities of the atom at $\mathbf{r}_i$ and its $k$ neighbors. For example, two-body correlations can be captured by the overlap between atom $i$ and its neighbors (see Fig. 1)

\[ p_2 = \sum_{j} \rho_1(\mathbf{r}, \mathbf{r}_i)\, \rho_1(\mathbf{r}, \mathbf{r}_j). \tag{3} \]

Thus, when only two-body correlations are present, the effective density of an atom $i$ that captures the overlap between the densities of the atom $i$ and its neighbor $j$ is given by

\[ \bar{p} = \sum_{j} \left[ \rho_1(\mathbf{r}, \mathbf{r}_i) - \rho_1(\mathbf{r}, \mathbf{r}_i)\, \rho_1(\mathbf{r}, \mathbf{r}_j) \right] = \rho_1(\mathbf{r}, \mathbf{r}_i) \sum_{j} \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_j) \right]. \tag{4} \]

Here, the sum is over all possible neighbors of atom $i$ in the system. Similarly, the overlap between the atom $i$ and two of
its neighbors (with indices $j$ and $k$) is given by (see Fig. 1)

\[ p_3 = \sum_{\langle jk \rangle} \rho_1(\mathbf{r}, \mathbf{r}_i)\, \rho_1(\mathbf{r}, \mathbf{r}_j)\, \rho_1(\mathbf{r}, \mathbf{r}_k), \tag{5} \]

and the effective density of the atom $i$, when both two and three-body correlations are taken into account, is given by

\[ \bar{p} = \sum_{\langle jk \rangle} \rho_1(\mathbf{r}, \mathbf{r}_i) \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_j) \right] \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_k) \right], \tag{6} \]

where the sum is over all possible pairs of neighbors of atom $i$.

Following the same procedure as mentioned above, we see [...] and the effective density of the atom $i$, when two, three and four-body correlations are all taken into account, is given by

\[ \bar{p} = \sum_{\langle jkl \rangle} \rho_1(\mathbf{r}, \mathbf{r}_i) \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_j) \right] \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_k) \right] \left[ 1 - \rho_1(\mathbf{r}, \mathbf{r}_l) \right]. \tag{8} \]

FIG. 1: The schematic in 1(a) illustrates the relevance of two-body correlations: sets A and B, and sets A and C have non-zero overlap between them, while sets B and C do not overlap. Similarly, the schematic in 1(b) illustrates the relevance of both two and three-body correlations. There is a finite overlap between any two sets and A∩B∩C is also finite. When sets A, B and C represent guess atomic densities of three atoms, then correlations captured in 1(a) and 1(b) correspond to $p_2$ and $p_3$ in Eqs. 3 and 5, respectively.

FIG. 2: A schematic representation of the mapping of neighbors, numbered from 1 through 8, of an atom (shown in blue) in a cluster (left) to a graph (right). The nodes of the graph correspond to the neighbors of the central atom, and each edge corresponds to a bond between a pair of neighbors. In this case, each node $j$ is connected to 7 other nodes with edges whose weights are given by $\rho^{(2)}_{jk}(\sqrt{2}\sigma)$, where $k \neq j$.

For a systematic analysis of correlations, the atomic density $\rho_1(\mathbf{r}, \mathbf{r}_i)$ of an isolated atom is approximated by a normalized Gaussian distribution, i.e.

\[ \rho_1(\mathbf{r}, \mathbf{r}_i) = \frac{1}{(2\pi\sigma^2)^{3/2}}\, e^{-|\mathbf{r} - \mathbf{r}_i|^2 / 2\sigma^2}. \tag{9} \]

Here, $\sigma$ determines the spread of the Gaussian. While the width $\sigma$ is a parameter that can be cross-validated, for the results presented in Section III, $\sigma$ was set to be equal to the distance corresponding to the local minimum between the first and the second peaks of the pair correlation function. Using the single particle density in Eq. 9, in Sections II A, II B, II C and II D we present in detail the procedure used to obtain two-body, three-body, four-body and five-body correlation descriptors. It is important to note here that using a Gaussian density distribution allows us to analytically evaluate the overlap integrals. We also note that the analysis presented below can be generalized using more generic functions, for example by using Gaussian type orbitals. But, the results presented in Section III show that this simple Gaussian function can capture important geometric features of the local neighborhood of an atom. In order to calculate these correlation descriptors, a knowledge of the spatial location of the neighbors of an atom is important. In a typical molecular dynamics simulation, local neighborhood information is preserved in the form of a neighbor-list. Hence, we assume that atom $i$ has $N_i$ neighbors that lie inside a sphere of radius $r_o$ centered at $\mathbf{r}_i$.

A. Two-body correlations

We start by describing the procedure used to obtain two-body correlation descriptors. The overlap between the densities of two atoms $i$ and $j$, located at $\mathbf{r}_i$ and $\mathbf{r}_j$, respectively, is

\[ C^{(2)}(\mathbf{r}_i, \mathbf{r}_j) = \int_{-\infty}^{\infty} \rho_1(\mathbf{r}, \mathbf{r}_i)\, \rho_1(\mathbf{r}, \mathbf{r}_j)\, d\mathbf{r} = \frac{1}{(2\pi\sigma^2)^{3}} \int_{-\infty}^{\infty} e^{-|\mathbf{r} - \mathbf{r}_i|^2 / 2\sigma^2}\, e^{-|\mathbf{r} - \mathbf{r}_j|^2 / 2\sigma^2}\, d\mathbf{r} = \frac{1}{8\,(\pi\sigma^2)^{3/2}}\, \rho^{(2)}_{ij}(\sqrt{2}\sigma), \qquad \rho^{(2)}_{ij}(\sqrt{2}\sigma) = e^{-|\mathbf{r}_i - \mathbf{r}_j|^2 / 4\sigma^2}. \tag{10} \]

Using this overlap function, we propose two descriptors that can capture local geometric information:

i. Let $u^{(1)}$ be a vector that quantifies the overlap between $i$ and its neighbors, i.e. $u^{(1)} = \left[ \rho^{(2)}_{i1}\ \rho^{(2)}_{i2}\ \rho^{(2)}_{i3} \cdots \rho^{(2)}_{iN_i} \right]$, where $\rho^{(2)}_{i1} = \rho^{(2)}_{i1}(\sqrt{2}\sigma)$, $\rho^{(2)}_{i2} = \rho^{(2)}_{i2}(\sqrt{2}\sigma)$, etc. Since $\rho^{(2)}_{ij}$ depends on the norm of the distance between the atom $i$ and its neighbor $j$, i.e. $|\mathbf{r}_i - \mathbf{r}_j|^2$, it is invariant to global translation, and other unitary transformations such as global rotation, i.e. if $U$ is a unitary matrix and if $\mathbf{r}_i \to U\mathbf{r}_i$ and $\mathbf{r}_j \to U\mathbf{r}_j$, then $|U\mathbf{r}_i - U\mathbf{r}_j|^2 = |\mathbf{r}_i - \mathbf{r}_j|^2$. The descriptor $u^{(1)}$ is, however, not invariant to permutation of two or more atomic indices. A simple way to make $u^{(1)}$ invariant to permutations of atomic indices is to sort the two-body correlations such that $\rho^{(2)}_{i1} \geq \rho^{(2)}_{i2} \geq \cdots \geq \rho^{(2)}_{iN_i}$.

ii. Another descriptor that captures two-body correlations is motivated by the use of discrete diffusion kernels and tools from spectral graph theory for the analysis of large data sets.34–36 We consider the $N_i$ neighbors of an atom $i$ to be located at the vertices of a graph $G$ (see Fig. 2) and each neighbor $j$ of $i$ is connected to another vertex $k$ (which is also a neighbor of vertex $i$) by an edge with weights given by $w_{jk} = w_{kj} = \rho^{(2)}_{jk}(\sqrt{2}\sigma)\,(1 - \delta_{jk}) = e^{-|\mathbf{r}_j - \mathbf{r}_k|^2 / 4\sigma^2}\,(1 - \delta_{jk})$. These weights, obtained from a scalar-valued Gaussian kernel, are symmetric with respect to the interchange of positions of atoms $j$ and $k$ and provide a local measure of the proximity between two atoms. If we assume that all neighbors of an atom $i$ interact with each other via such weights, then each vertex in the graph $G$ is connected to all other vertices in $G$. Thus, there exists a path connecting any two vertices $j$ and $k$, meaning that $G$ is a connected regular graph.37 The adjacency matrix of $G$, denoted by $K^{(2)}$, is an $N_i \times N_i$ real symmetric matrix with entries $K^{(2)}_{jk} = w_{jk}$.

The adjacency matrix $K^{(2)}$ has many interesting properties: Since it is a symmetric matrix, its eigenvalues are real. Further, the diagonal elements of $K^{(2)}$ are zero, meaning that the trace of $K^{(2)}$, and hence the sum of its eigenvalues, is zero. Since the two-body overlap function, $\rho^{(2)}_{jk}(\sqrt{2}\sigma)$ (see Eq. 10), depends on the $L_2$-norm of the distance between two vertices, it is invariant to global translation and rotation. From the Perron-Frobenius theorem, we know that the largest eigenvalue, $\lambda^{(2)}_{max}$, of $K^{(2)}$ lies between the average degree and the maximum degree of the graph $G$ and if $G$ is bipartite then the smallest eigenvalue of $K^{(2)}$ is related to the largest eigenvalue: $\lambda^{(2)}_{min} = -\lambda^{(2)}_{max}$.37 The adjacency matrix (after proper normalization) is related to the graph Laplacian that is commonly used in dimension reduction techniques like Laplacian eigenmaps, diffusion maps, and clustering techniques, like spectral clustering.35,36,38–40 Coifman et al. have shown that the diffusion distance (captured by the two-body overlap function $\rho^{(2)}$) is an important geometric quantity that links the spectral theory of the Markov process to the geometry, density and distribution of the data.36 Further, it was shown that a small subset of the eigenfunctions of the graph Laplacian can be used to construct a low-dimensional geometric embedding of the high-dimensional data set.35,36,40 Thus, the set of $N_i$ eigenvalues (denoted by the $N_i$-dimensional vector $u^{(2)}$) of $K^{(2)}$ is a descriptor that captures the essential geometric features of the local environment of atom $i$. These eigenvalues are invariant to global translation, rotation and to permutations of atomic indices of the neighbors.

We note here that graph kernels, which are actively used to compare graphs, encode intrinsic geometric features and are also an attractive route to develop accurate machine learning potentials.41–43 In non-linear regression techniques like Gaussian process regression, the development of kernels that can effectively distinguish between two different atomic environments has also received attention. Typically, Gaussian kernels (i.e. $\exp\left(-|X_i - X_j|^2 / 2\sigma^2\right)$, where $X_i$ and $X_j$ are the positions of two structures and $\sigma$ is the width) are used to measure the proximity between two different environments and construct the covariance matrix for Gaussian process regression based potential fitting.44 Duvenaud et al. have shown that the ability to capture intrinsic structural features can be significantly improved by including correlations between intrinsic degrees of freedom.45 Along this line, based on the overlap integral between two configurations, Glielmo et al. proposed a kernel that includes the correlation between all the atoms in structures being compared.11 These authors further [...]

[...] correlation $C^{(32)}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)$ tracks closed paths. To calculate these correlations we consider the following overlap integral:

\[ C^{(31)}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k) = \frac{1}{(2\pi\sigma^2)^{6}} \int_{-\infty}^{\infty} e^{-|\mathbf{r} - \mathbf{r}_i|^2 / 2\sigma^2}\, e^{-|\mathbf{r} - \mathbf{r}_j|^2 / 2\sigma^2}\, d\mathbf{r} \times \int_{-\infty}^{\infty} e^{-|\mathbf{r}' - \mathbf{r}_i|^2 / 2\sigma^2}\, e^{-|\mathbf{r}' - \mathbf{r}_k|^2 / 2\sigma^2}\, d\mathbf{r}' = \frac{1}{64\,(\pi\sigma^2)^{3}}\, \rho^{(2)}_{ij}(\sqrt{2}\sigma)\, \rho^{(2)}_{ik}(\sqrt{2}\sigma) \]
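To make the two-body descriptors of Section II A and the product form of $C^{(31)}$ above more concrete, the following is a minimal Python sketch rather than the author's implementation. It assumes NumPy is available and that neighbor positions are supplied as an array of Cartesian coordinates; the function names (`rho2`, `u1_descriptor`, `k2_matrix`, `u2_descriptor`, `c31`) are hypothetical and chosen here only for illustration.

```python
import numpy as np


def rho2(r_a, r_b, sigma):
    """Two-body overlap function of Eq. 10: exp(-|r_a - r_b|^2 / (4 sigma^2))."""
    d2 = np.sum((np.asarray(r_a, dtype=float) - np.asarray(r_b, dtype=float)) ** 2)
    return np.exp(-d2 / (4.0 * sigma ** 2))


def u1_descriptor(r_i, neighbors, sigma):
    """Overlaps between atom i and each neighbor, sorted in descending order
    so that the vector does not depend on the labeling of the neighbors."""
    vals = np.array([rho2(r_i, r_j, sigma) for r_j in neighbors])
    return np.sort(vals)[::-1]


def k2_matrix(neighbors, sigma):
    """Weighted adjacency matrix K^(2) over the neighbors of atom i:
    K_jk = exp(-|r_j - r_k|^2 / (4 sigma^2)) for j != k, and zero on the diagonal."""
    nbrs = np.asarray(neighbors, dtype=float)
    diff = nbrs[:, None, :] - nbrs[None, :, :]
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (4.0 * sigma ** 2))
    np.fill_diagonal(K, 0.0)
    return K


def u2_descriptor(neighbors, sigma):
    """Eigenvalues of K^(2), sorted in descending order."""
    return np.sort(np.linalg.eigvalsh(k2_matrix(neighbors, sigma)))[::-1]


def c31(r_i, r_j, r_k, sigma):
    """Product form of the C^(31) overlap: prefactor times rho2(i, j) * rho2(i, k)."""
    prefactor = 1.0 / (64.0 * (np.pi * sigma ** 2) ** 3)
    return prefactor * rho2(r_i, r_j, sigma) * rho2(r_i, r_k, sigma)
```

Because every entry depends only on interatomic distances, applying the same rotation and translation to all atoms, or relabeling the neighbors, should leave `u1_descriptor` and `u2_descriptor` unchanged up to floating-point error, and the entries of `u2_descriptor` should sum to zero since the diagonal of $K^{(2)}$ vanishes.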
\[ K_{adj} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \tag{13} \]

with eigenvalues of $\{-\sqrt{2}, 0, \sqrt{2}\}$. In general, the $k \times k$ adjacency matrix of a star graph (with one node connected to $(k - 1)$ other nodes) has $(k - 2)$ zero eigenvalues and the two non-zero eigenvalues are $\pm\sqrt{k - 1}$. Thus, the eigenvalues of the symmetric matrix $K^{(31)}$ that embeds three-body correlations are different from those of the adjacency matrix of a star graph.
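The star-graph spectrum quoted above can be checked numerically. The short Python sketch below assumes NumPy is available; the helper names `star_adjacency` and `star_spectrum` are hypothetical and are not part of the original work.

```python
import numpy as np


def star_adjacency(k):
    """Unweighted adjacency matrix of a star graph with k nodes:
    node 0 is the hub and is connected to the other (k - 1) nodes."""
    A = np.zeros((k, k))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return A


def star_spectrum(k):
    """Eigenvalues of the star-graph adjacency matrix in ascending order."""
    return np.sort(np.linalg.eigvalsh(star_adjacency(k)))
```

For `k = 3` the matrix returned by `star_adjacency` coincides with $K_{adj}$ in Eq. 13, and for larger `k` the spectrum should consist of $\pm\sqrt{k - 1}$ together with $(k - 2)$ zeros, up to floating-point error.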
C. Four-body correlations

To obtain descriptors derived from four-body correlations, we consider four different contributions due to the overlap between local densities of three atoms in the neighbor-list of atom $i$. These contributions are detailed below:

[...] Geometrically, this is similar to a star graph as shown in Fig. 4(a). To evaluate this four-body correlation, for each pair of atoms $(i, j)$ we define a $(N_i - 1) \times (N_i - 1)$ square symmetric matrix $K^{(41)}$, such that $K^{(41)}_{kl} = \rho^{(2)}_{ik}(\sqrt{2}\sigma)\, \rho^{(2)}_{il}(\sqrt{2}\sigma)\, (1 - \delta_{kl})\, (1 - \delta_{jl})\, (1 - \delta_{jk})$, where $k, l = 1, 2, 3, \ldots, (N_i - 1)$. Let $v^{(41)}(k, l \,|\, i, j) = \left[ \lambda^{(41)}_1\ \lambda^{(41)}_2 \cdots \lambda^{(41)}_{(N_i - 1)} \right]$ be a vector that contains the sorted eigenvalues of $K^{(41)}$. [...] is a descriptor that incorporates the four-body correlations in $C^{(41)}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k, \mathbf{r}_l)$.

FIG. 4: Correlations due to the overlap between the atomic densities in a cluster containing 4 atoms. (Panel (b) is labeled $C^{(43)}$ and $C^{(44)}$.)

vi. The correlation due to the presence of an atom $k$ when densities of three other atoms $i$, $j$ and $l$ already have a
Following (vi) and (vii), we defined a vector $u^{(43)}$ that captures this correlation.

FIG. 5: Correlations due to the overlap between the atomic densities in a cluster containing 5 atoms.

[...] atom with four neighbors some of which are shown in Fig. 5. However, for the examples studied in Section III, correlations that arise due to two-, three- and four-body density overlaps are sufficient to capture intrinsic details of the local geometry. Thus, the contributions of five-body or higher order correlations are negligible for the systems considered in this study.

III. NUMERICAL RESULTS: RECONSTRUCTION OF SMALL CLUSTERS

A. Simulation setup and computational details

The ability of the many-body correlation descriptors to capture intrinsic geometric features of local atomic environments was tested by performing reconstruction simulations. To this end, we used 8 clusters containing {10, 14, 20, 25} Ge atoms and {10, 15, 20, 25} Cu atoms. These structures were generated from bulk supercells containing 64 Ge and 256 Cu atoms by multiple surface cuts. For reconstruction simulations, atomic positions in these clusters were perturbed by using random numbers uniformly distributed in the interval (-1, 1) Å and sets containing 200 randomly perturbed structures were generated. These perturbed structures were then used as starting configurations for reconstruction simulations. The goal was to generate the unperturbed configuration (denoted by $X$) using the set of vectors $u = \{ u^{(1)}, u^{(2)}, u^{(31)}, u^{(32)}, u^{(41)}, u^{(42)}, u^{(43)}, u^{(44)}, u^{(51)} \}$. We minimized the cost function, $\Omega$, to recover the target structures, where

\[ \Omega(X, X') = \sum_{i=1}^{N} \Big\{ \left[ 1 - \left( \hat{u}^{(1)}_{iX}, \hat{u}^{(1)}_{iX'} \right)^{m} \right] + \left[ 1 - \left( \hat{u}^{(2)}_{iX}, \hat{u}^{(2)}_{iX'} \right)^{m} \right] + \left[ 1 - \left( \hat{u}^{(31)}_{iX}, \hat{u}^{(31)}_{iX'} \right)^{m} \right] + \left[ 1 - \left( \hat{u}^{(32)}_{iX}, \hat{u}^{(32)}_{iX'} \right)^{m} \right] + \left[ 1 - \left( \hat{u}^{(41)}_{iX}, \hat{u}^{(41)}_{iX'} \right)^{m} \right] + \cdots \Big\}. \tag{21} \]

Here, $X$ is the reference or the target structure, $X'$ is the reconstructed structure, $\hat{u}^{(\cdot)} = u^{(\cdot)} / \| u^{(\cdot)} \|$, $m = 8$, $N$ is the total number of atoms in $X$, $X'$, and the vectors $u^{(2)}_{iX}$, $u^{(2)}_{iX'}$, $u^{(31)}_{iX}$, $u^{(31)}_{iX'}$, etc. are descriptors that encode different correlations. The sum in the above equation is over all atoms in the two structures, $X$ and $X'$, and $(u, v)$ denotes a scalar product. The cost function was optimized using the Nelder-Mead simplex search method proposed by Lagarias et al.47 as implemented in Matlab. The width, $\sigma$, of the single particle density in Eq. 9 was set to 3 Å and 3.5 Å (midway between the first and the second peaks of the pair correlation function) for Cu and Ge clusters, respectively. While the cost function can be further minimized (using cross-validation) with respect to this hyper-parameter, we were able to obtain satisfactory results with these values of $\sigma$.

Following Ref. [9], reconstructed structures (denoted by $X'$) were compared to the target structure, $X$, using the metric $d^{W}_{X,X'}$ which compares Weyl matrices of these two structures. The Weyl matrix, $\Sigma$, used here is invariant to global translation, rotation and reflection, and takes the following form:

\[ \Sigma_X = \begin{pmatrix} \mathbf{r}_1 \cdot \mathbf{r}_1 & \mathbf{r}_1 \cdot \mathbf{r}_2 & \mathbf{r}_1 \cdot \mathbf{r}_3 & \cdots & \mathbf{r}_1 \cdot \mathbf{r}_N \\ \mathbf{r}_2 \cdot \mathbf{r}_1 & \mathbf{r}_2 \cdot \mathbf{r}_2 & \mathbf{r}_2 \cdot \mathbf{r}_3 & \cdots & \mathbf{r}_2 \cdot \mathbf{r}_N \\ \mathbf{r}_3 \cdot \mathbf{r}_1 & \mathbf{r}_3 \cdot \mathbf{r}_2 & \mathbf{r}_3 \cdot \mathbf{r}_3 & \cdots & \mathbf{r}_3 \cdot \mathbf{r}_N \\ \cdots & & & & \\ \mathbf{r}_N \cdot \mathbf{r}_1 & \mathbf{r}_N \cdot \mathbf{r}_2 & \mathbf{r}_N \cdot \mathbf{r}_3 & \cdots & \mathbf{r}_N \cdot \mathbf{r}_N \end{pmatrix} \tag{22} \]

The metric $d^{W}_{X,X'}$ which measures the distance between two Weyl matrices was calculated from

\[ d^{W}_{X,X'} = \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} \left( (\Sigma_X)_{ij} - (\Sigma_{X'})_{ij} \right)^{8} \right]^{1/8}. \tag{23} \]

Reconstruction simulations were terminated when $d^{W}_{X,X'} < 1 \times 10^{-3}$ Å$^2$. The quality of reconstructed structures was also monitored using the root mean squared deviation of each atomic degree of freedom from the target, $X$:

\[ d^{E}_{X,X'} = \left[ \frac{1}{3N} \sum_{i=1}^{N} \sum_{j=1}^{3} \left( r^{j}_{iX} - r^{j}_{iX'} \right)^{2} \right]^{1/2}. \tag{24} \]

For clusters containing fewer than 20 atoms, reconstruction simulations typically required 15000-20000 function evaluations to converge. On the other hand, structures containing 20 and 25 Cu or Ge atoms typically required 40000-50000 function evaluations to converge.

B. Results

Figure 6 shows the effect of two-body, three-body and four-body correlations on the quality of reconstructed structures containing 10 Cu atoms. The distribution of $d^{E}_{X,X'}$ in Fig. 6(a) lies between 0 and 0.3 Å (mean and standard deviations are 0.118 and 0.057 Å, respectively) when only the $u^{(2)}$ descriptor was used for the 200 reconstruction simulations. The distributions of $d^{E}_{X,X'}$ in Figs. 6(b) and 6(c) correspond to reconstruction simulations performed using $\{u^{(2)}, u^{(31)}, u^{(32)}\}$ and $\{u^{(2)}, u^{(31)}, u^{(32)}, u^{(41)}, u^{(42)}, u^{(43)}, u^{(44)}\}$, respectively. Using both two-body and three-body correlations decreased the mean and standard deviations of $d^{E}_{X,X'}$ to 0.100 and 0.039 Å, respectively. Similarly, using two-body, three-body and four-body correlations pushed the distribution further left, i.e. the mean of $d^{E}_{X,X'}$ from all the 200 reconstructed structures was 0.098 Å while the standard deviation remained unchanged. This is evident from the fact that the maximum value of $d^{E}_{X,X'}$ decreased from ∼0.30 Å (when only $u^{(2)}$ was used) to 0.20 Å when three-body and four-body descriptors were also used (see Figs. 6(b) and 6(c)). This is also evident from Fig. 6(d) which compares the values of $d^{E}_{X,X'}$ when reconstruction simulations were performed using
[Figure 6, panels (a)-(d). Panel (d) plots $d^{E}_{X,X'}$ [Å] obtained with 2,3,4-body descriptors against $d^{E}_{X,X'}$ [Å] obtained with 2-body descriptors.]

FIG. 6: The quality of reconstruction improved when descriptors corresponding to three-body and four-body correlations were included. Shown here is the distribution of the metric $d^{E}_{X,X'}$ for the 200 reconstructed structures containing 10 Cu atoms. The results in Fig. 6(a) correspond to reconstructions performed using only $u^{(2)}$, while results in Figs. 6(b) and 6(c) correspond to reconstruction simulations performed using $u = \{u^{(2)}, u^{(31)}, u^{(32)}\}$ and $u = \{u^{(2)}, u^{(31)}, u^{(32)}, u^{(41)}, u^{(42)}, u^{(43)}, u^{(44)}\}$, respectively. Fig. 6(d) compares the values of $d^{E}_{X,X'}$ when reconstruction simulations were performed using only $u^{(2)}$ to those obtained using two, three and four-body correlations starting from the same set of configurations.
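The quantities used to drive and monitor the reconstructions in Section III A (one bracket of the cost function in Eq. 21, the Weyl-matrix metric of Eq. 23 and the root mean squared deviation of Eq. 24) can be sketched in a few lines of Python. This is an illustrative sketch only, not the Matlab code used in this work: it assumes NumPy, it assumes positions are given as an N × 3 array in a common coordinate frame, and the names `cost_term`, `total_cost`, `weyl_matrix`, `d_weyl` and `d_rms` are hypothetical. The Nelder-Mead optimization itself is not reproduced here.

```python
import numpy as np


def cost_term(u_target, u_trial, m=8):
    """One bracket of Eq. 21: 1 - (u_hat_target . u_hat_trial)^m,
    where u_hat is the descriptor vector scaled to unit norm."""
    a = np.asarray(u_target, dtype=float)
    b = np.asarray(u_trial, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - np.dot(a, b) ** m


def total_cost(target, trial, m=8):
    """Sum of cost terms over atoms and descriptor families.
    `target` and `trial` are lists (one entry per atom) of lists of
    descriptor vectors (one per family); this layout is an assumption."""
    return sum(
        cost_term(u_t, u_p, m)
        for atom_t, atom_p in zip(target, trial)
        for u_t, u_p in zip(atom_t, atom_p)
    )


def weyl_matrix(X):
    """Matrix of scalar products r_i . r_j (Eq. 22) for an N x 3 array X."""
    X = np.asarray(X, dtype=float)
    return X @ X.T


def d_weyl(X, X_prime, p=8):
    """Eq. 23: p-norm (p = 8) of the difference between two Weyl matrices."""
    diff = weyl_matrix(X) - weyl_matrix(X_prime)
    return np.sum(np.abs(diff) ** p) ** (1.0 / p)


def d_rms(X, X_prime):
    """Eq. 24: root mean squared deviation over all 3N coordinates."""
    X = np.asarray(X, dtype=float)
    X_prime = np.asarray(X_prime, dtype=float)
    return np.sqrt(np.mean((X - X_prime) ** 2))
```

In this sketch, `cost_term` vanishes when the two descriptor vectors are parallel and equals one when they are orthogonal, and `d_weyl` should be zero (to rounding error) for two structures that differ only by a rotation or reflection about the origin.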
argument that two and three-body correlations are sufficient to obtain high quality of reconstruction for Cu clusters.

Finally, we verified that the spectra of only two-body and three-body overlap matrices are sufficient to reconstruct clusters containing 20 and 25 Ge atoms. To this end, we used the descriptors $u = \{u^{(1)}, u^{(2)}, u^{(31)}, u^{(32)}\}$. Figures 9(b) and 9(d) show the distribution of $d^{E}_{X,X'}$ of the final reconstructed clusters containing 20 and 25 Ge atoms, respectively. The distribution of initial structures used for these reconstructions is shown in the two-dimensional space of $d^{W}_{X,X'}$-$d^{E}_{X,X'}$ in Figs. 9(a) and 9(c). The fact that the reconstructed structures satisfy the convergence condition $d^{W}_{X,X'} < 1 \times 10^{-3}$ Å$^2$ clearly shows that higher order correlations are not required for this system.

IV. DISCUSSIONS

A. Uniqueness of reconstruction

It is known that a graph is not uniquely determined by the spectrum of its adjacency matrix,48 meaning that the spectrum of the adjacency matrix cannot be used to uniquely reconstruct the original graph from a perturbed one.48 However, we were able to successfully reconstruct structures of clusters because we used descriptors for each atom in the cluster. In addition, we note here that neighborhood information of atoms in these clusters had considerable overlap. For example, for the cluster shown in Fig. 2, the central atom (shown in blue) has eight neighbors within a distance of $\sqrt{3}a/2$, where $a$ is the edge length of the cube. The atom marked 1 is a neighbor of the central atom and its nearest neighbors are the central atom and atoms marked 2, 4 and 5 (distance from atom 1 is $a$). Thus, our reconstruction results suggest that spectra of adjacency matrices of such overlapping sub-graphs of a graph can be used to uniquely reconstruct the graph.

B. Reconstruction with a truncated spectrum

The many-body correlation descriptors presented here can capture the overlap between the densities of neighboring atoms, yet by incorporating local proximity information between different atoms (or nodes when the atoms are considered to be located at the vertices of a regular connected graph), they provide a global description of a high dimensional system. Such a procedure is routinely used to find meaningful geometric description in clustering and dimension reduction techniques. In this context, the top few eigenvalues of the adjacency matrix of a graph embed important information about the degree and connectivity of the graph and such properties are relevant for reconstruction of clusters. Thus, it is natural to ask if all the eigenvalues are required for successful structural reconstruction simulations. To investigate this, we used two clusters containing 14 and 25 Ge atoms. Two sets, each containing 100 structures, were obtained by perturbing the atomic positions in these clusters. For the reconstructions reported in Section III B, for a cluster with $N = 14$ Ge atoms, information about the neighborhood of each atom was captured using $(N - 1) = 13$ eigenvalues of $K^{(2)}$, $K^{(31)}$ and $K^{(32)}$. We were, however, able to successfully reconstruct the structures after removing as many as 40% of the small eigenvalues of the correlation matrices. This suggests that structural reconstruction can be performed using only the top eigenvalues of the two-body and three-body correlation descriptors.

C. Comparison with existing models

The results in Section III show that an accurate representation of the local environment can be achieved by systematically incorporating higher order correlations. Machine learned potentials within the Gaussian approximation potential (GAP) framework incorporated two and three-body terms using the power-spectrum and bi-spectrum descriptors. For example, the local environment of an atom $i$ in GAP is characterized by the atomic density defined as

\[ \rho_i(\mathbf{r}) = \sum_{j} e^{-|\mathbf{r} - \mathbf{r}_{ij}|^2 / 2\sigma^2} = \sum_{n,l,m} c_{nlm}\, g_n(|\mathbf{r}|)\, Y_{lm}(\mathbf{r}) \tag{26} \]

and the descriptor vector is given by $q = \sum_{m} c^{*}_{nlm}\, c_{n'lm}$. Since $g_n$ and $Y_{lm}$ are orthogonal basis sets, the product $c^{*}_{nlm}\, c_{n'lm}$ includes two-body correlations like $g_n(|\mathbf{r}_{ij}|)\, g_{n'}(|\mathbf{r}_{ik}|)$, where $j$ and $k$ are indices of two neighbors of atom $i$. Similarly, bi-spectrum components include three-body correlations like $g_n(|\mathbf{r}_{ij}|)\, g_{n'}(|\mathbf{r}_{ik}|)\, g_n(|\mathbf{r}_{il}|)$. In this study, instead of the sum of all many-body correlations, different two-body, three-body and four-body correlation information were used to obtain correlation matrices. The spectra of such correlation matrices were shown to embed structural information about the local neighborhood of an atom.

The spectral neighbor analysis potential (SNAP)7 framework is another MLP in which the energy of an atom $i$ is expressed as

\[ E_i = \beta_0 + \sum_{p} \beta_p B^{i}_{p}, \tag{27} \]

where $B^{i}_{p}$ are the bi-spectrum components and $\beta_p$ are the coefficients of this expansion. Recently, it was shown that the accuracy of this MLP can be increased (an order of magnitude decrease in training error was reported) by incorporating a second order term,8 i.e.

\[ E_i = \beta_0 + \sum_{p} \beta_p B^{i}_{p} + \sum_{p,q} \alpha_{pq} B^{i}_{p} B^{i}_{q}. \tag{28} \]

Here $\alpha_{pq}$ are the coefficients. This improvement in accuracy due to the presence of a second order term is in agreement with our conclusion that incorporating higher order correlations results in better learning of the local neighborhood.

Another machine learning based interatomic potential that
[Figure 8, panels (a)-(d).]

FIG. 8: Quality of reconstruction, when using only $u^{(2)}$, improved when the system size was increased. Shown here is the distribution of the metric $d^{E}_{X,X'}$ for the 200 reconstructed structures containing 15 (Figs. 8(a), 8(b)), 20 (Fig. 8(c)) and 25 (Fig. 8(d)) Cu atoms. A systematic improvement in reconstruction of clusters containing 15 Cu atoms was also observed when both two-body and three-body density correlations were used.
incorporates local correlations was proposed by Lindsey et al.24,25 In this model, the energy of each atom was expanded using a basis of Chebyshev functions:

\[ E_i = \sum_{j} E_{ij} + \sum_{j,k} E_{ijk}, \quad \text{where} \quad E_{ij} = \sum_{p} c_p\, T_p(s_{ij}), \quad \text{and} \quad E_{ijk} = \sum_{p} \sum_{q} \sum_{r} c_{pqr}\, T_p(s_{ij})\, T_q(s_{ik})\, T_r(s_{jk}). \tag{29} \]

In the above summation, $T_n(s_{ij})$ is a Chebyshev polynomial (of order $n$) of the first kind, $s_{ij} \in [-1, 1]$ is a transformed nearest neighbor distance, and $c_p$, $c_{pqr}$ are coefficients of the two and three-body terms. While $E_{ij}$ incorporates two-body correlations, $E_{ijk}$ incorporates correlations described by $C^{(32)}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)$. Based on our results (see Section II B), we hypothesize that the accuracy and the transferability of this MLP can be improved by including a term that incorporates correlations described by $C^{(31)}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)$:

\[ E_i = \sum_{j} E_{ij} + \sum_{j,k} \left( \bar{E}_{ijk} + E_{ijk} \right), \quad \text{where} \quad \bar{E}_{ijk} = \sum_{p} \sum_{q} c_{pq}\, T_p(s_{ij})\, T_q(s_{ik}). \tag{30} \]

Here, $c_{pq}$ are coefficients of expansion, and $E_{ij}$, $E_{ijk}$ are the two and three-body terms described in Eq. 29. Similarly, higher order correlations can be incorporated by following the strategy discussed in Section II C. In general, the Chebyshev basis functions can be replaced by any other basis functions, such as Bessel functions, Neumann functions, Morlet wavelets, Slater or Gaussian type orbitals, that have been used recently in the literature to generate MLP.5,6

Machine learned potentials based on neural networks that have been proposed in the literature have used symmetry functions that included two-body and three-body correlations.1,28–30,49 Even though such correlation functions were not based on an expansion using a basis set, we hypothesize that the accuracy and transferability can be enhanced
[FIG. 9, panels (a)-(d). Recoverable axis labels: vertical axis $d^{E}_{X,X'}$ [Angstrom]; horizontal axis $d^{W}_{X,X'}$ [Angstrom$^2$].]
FIG. 9: Figs. 9(a) and 9(c) show the distribution of 100 initial structures, containing 20 and 25 Ge atoms, respectively, obtained by perturbing the target structure (using uniformly random numbers in the range of [-1, 1]), in the two-dimensional space of $d^{W}_{X,X'}$ and $d^{E}_{X,X'}$. The values of $d^{E}_{X,X'}$ for the final reconstructed structures are shown in Figs. 9(b) (clusters containing 20 Ge atoms) and 9(d) (clusters containing 25 Ge atoms).
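To make the structure of the Chebyshev expansion in Eqs. 29 and 30 concrete, the following is a minimal Python sketch. It is an illustration only, not the implementation of Lindsey et al. It assumes that the transformed distances $s_{ij}$ are supplied precomputed (the transformation itself is not specified here), that the sum over $j,k$ runs over ordered pairs of distinct neighbors, and that coefficients are stored in dictionaries keyed by polynomial order. The names `chebyshev_T` and `atom_energy`, and all coefficient values, are hypothetical.

```python
from itertools import product


def chebyshev_T(n_max, s):
    """Return [T_0(s), ..., T_n_max(s)] using T_{n+1} = 2 s T_n - T_{n-1}."""
    values = [1.0, s]
    for _ in range(2, n_max + 1):
        values.append(2.0 * s * values[-1] - values[-2])
    return values[: n_max + 1]


def atom_energy(s, i, c2, c_bar, c3, n_max):
    """E_i = sum_j E_ij + sum_{j,k} (Ebar_ijk + E_ijk), as in Eqs. 29 and 30.

    s      : symmetric matrix of transformed distances, entries in [-1, 1]
    c2     : {p: c_p}            two-body coefficients
    c_bar  : {(p, q): c_pq}      coefficients of the additional term (Eq. 30)
    c3     : {(p, q, r): c_pqr}  three-body coefficients
    n_max  : highest polynomial order appearing in any coefficient key
    """
    n = len(s)
    neighbors = [j for j in range(n) if j != i]
    # Tabulate the polynomials once per pair of atoms.
    T = {(a, b): chebyshev_T(n_max, s[a][b])
         for a in range(n) for b in range(n) if a != b}

    energy = 0.0
    for j in neighbors:  # two-body terms E_ij
        energy += sum(c * T[i, j][p] for p, c in c2.items())
    for j, k in product(neighbors, neighbors):
        if j == k:
            continue  # assumption: ordered pairs of distinct neighbors
        # Ebar_ijk depends only on s_ij and s_ik.
        energy += sum(c * T[i, j][p] * T[i, k][q]
                      for (p, q), c in c_bar.items())
        # E_ijk also depends on the neighbor-neighbor distance s_jk.
        energy += sum(c * T[i, j][p] * T[i, k][q] * T[j, k][r]
                      for (p, q, r), c in c3.items())
    return energy


# Hypothetical three-atom example with made-up coefficients.
s = [[0.0, 0.5, -0.5],
     [0.5, 0.0, 0.0],
     [-0.5, 0.0, 0.0]]
c2 = {2: 1.0}
c_bar = {(1, 1): 2.0}
c3 = {(1, 1, 0): 1.0}
energy = atom_energy(s, 0, c2, c_bar, c3, n_max=2)
```

Keying the coefficients by order leaves the range of the sums over $p$, $q$ and $r$ open, since that range is not fixed in the text above. Passing an empty dictionary for `c_bar` recovers the form of Eq. 29.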
by incorporating higher order correlations similar to those described in Eqs. 28 and 30.

V. SUMMARY

We have developed a framework to systematically generate descriptors based on many-body correlations that can effectively capture intrinsic geometric features of the local environment of an atom. These descriptors were obtained from the eigenvalues of two-body, three-body, four-body and higher order correlations between an atom and its neighbors. These many-body correlations correspond to two-body, three-body, four-body, etc. overlap integrals, which were solved analytically by representing the density of an isolated atom by a Gaussian function. These descriptors are invariant to global translation, global rotation, reflection, as well as permutations of atomic indices.

The ability of these descriptors to describe local environments was systematically tested by reconstructing structures of clusters containing 10 to 25 atoms. Our results suggest that two and three-body correlations, i.e. $u^{(2)}$, $u^{(31)}$ and $u^{(32)}$, are important for successful reconstruction of the original structure from a perturbed one. These descriptors capture the collective behavior of constituent atoms and contain important geometric information (such as the degree, connectivity, and closed paths) about the distribution of neighbors of an atom. The two-body descriptor, $u^{(1)}$, is sensitive to small changes in bond length and was found to increase the rate of convergence of reconstruction simulations. This improved rate of convergence when $u^{(1)}$ was included with both the two-body and the three-body descriptors is a manifestation of the inclusion of local as well as non-local (collective) structural information.

We also found that both of the three-body descriptors, i.e. $u^{(31)}$ and $u^{(32)}$, are important. In general, the quality of reconstruction can be further improved by including four-body correlations, but for the systems considered in this study this led to only a marginal improvement in the quality of the reconstructed structures. Thus, higher order correlations are less important than the two and three-body correlations. Nevertheless, this framework allows us to systematically select the number of descriptors (i.e. the type of correlations used) based on the trade-off between accuracy, efficiency and computational cost.
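The invariance properties stated in this summary can be checked numerically. The sketch below is an illustration under stated assumptions, not the descriptors defined in this work: it assumes a two-body overlap matrix of the form $C_{ij} = \exp(-|r_i - r_j|^2 / 4\sigma^2)$, which is the functional form of the overlap of two Gaussians of width $\sigma$ up to a constant prefactor. Instead of calling an eigen-solver, it compares the power sums $\sum_i \lambda_i^k = \mathrm{tr}(C^k)$, which are functions of the eigenvalues alone. The function names, the value of $\sigma$ and the coordinates are hypothetical.

```python
import math


def overlap_matrix(positions, sigma=1.0):
    """Assumed Gaussian-overlap form: C_ij = exp(-|r_i - r_j|^2 / (4 sigma^2))."""
    return [[math.exp(-math.dist(a, b) ** 2 / (4.0 * sigma ** 2))
             for b in positions] for a in positions]


def spectral_power_sums(matrix, k_max=3):
    """Return [tr(C), tr(C^2), ..., tr(C^k_max)], i.e. sum_i lambda_i^k."""
    n = len(matrix)
    power = [row[:] for row in matrix]
    sums = []
    for _ in range(k_max):
        sums.append(sum(power[i][i] for i in range(n)))
        # Multiply by C once more to obtain the next matrix power.
        power = [[sum(power[i][m] * matrix[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
    return sums


def transform(positions, angle, shift, reflect=False):
    """Optionally mirror x, rotate about z by `angle`, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in positions:
        if reflect:
            x = -x
        out.append((c * x - s * y + shift[0],
                    s * x + c * y + shift[1],
                    z + shift[2]))
    return out


# Hypothetical six-atom cluster (arbitrary coordinates and units).
cluster = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.2, 0.0),
           (0.0, 0.0, 0.9), (1.1, 1.0, 0.3), (-0.8, 0.5, 1.0)]

# Reflect, rotate, translate, then reverse the atom order (a permutation).
moved = transform(cluster, angle=0.7, shift=(1.0, -2.0, 0.5), reflect=True)[::-1]

reference = spectral_power_sums(overlap_matrix(cluster))
transformed = spectral_power_sums(overlap_matrix(moved))
max_diff = max(abs(a - b) for a, b in zip(reference, transformed))
```

Here `max_diff` should vanish to within floating-point round-off, because the matrix depends only on interatomic distances and a permutation of atoms is a similarity transformation of the matrix. Displacing a single atom, by contrast, changes the power sums.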
The ability of these descriptors to capture intrinsic neighborhood information means that these descriptors can be used to generate machine learned potentials and expedite first principles molecular dynamics or Monte Carlo simulations within the learn on the fly paradigm.10

Acknowledgments

The author wishes to thank John Klepeis, Eric Schwegler, Jonathan Belof, Edgar Landinez, Xavier Andrade, Arthur Tamm, ShinYoung Kang and Leora Dresselhaus-Cooper for numerous stimulating discussions, and Albert Bartók for sharing the details of reconstruction of Si-clusters using bispectrum and power spectrum descriptors. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Computing support for this work came from the Lawrence Livermore National Laboratory (LLNL) Institutional Computing Grand Challenge program. (IM release number: LLNL-JRNL-757813)
∗ Electronic address: samanta1@llnl.gov
1 J. Behler and M. Parrinello, Physical Review Letters 98, 146401 (2007).
2 A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, Physical Review Letters 104, 136403 (2010).
3 M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, Physical Review Letters 108, 058301 (2012).
4 K. Hansen, G. Montavon, F. Biegler, S. Fazli, M. Rupp, M. Scheffler, O. A. von Lilienfeld, A. Tkatchenko, and K.-R. Müller, Journal of Chemical Theory and Computation 9, 3404 (2013).
5 A. Seko, A. Takahashi, and I. Tanaka, Physical Review B 90, 024101 (2014).
6 A. Seko, A. Takahashi, and I. Tanaka, Physical Review B 92, 054113 (2015).
7 A. Thompson, L. Swiler, C. Trott, S. Foiles, and G. Tucker, Journal of Computational Physics 285, 316 (2015).
8 M. A. Wood and A. P. Thompson, Journal of Chemical Physics 148, 241721 (2018).
9 A. P. Bartók, R. Kondor, and G. Csányi, Physical Review B 87, 184115 (2013).
10 Z. Li, J. R. Kermode, and A. De Vita, Physical Review Letters 114, 096405 (2015).
11 A. Glielmo, P. Sollich, and A. De Vita, Physical Review B 95, 214302 (2017).
12 A. Grisafi, D. M. Wilkins, G. Csányi, and M. Ceriotti, Physical Review Letters 120, 036002 (2018).
13 S. J. Plimpton and A. P. Thompson, MRS Bulletin 37, 513 (2012).
14 V. Botu and R. Ramprasad, International Journal of Quantum Chemistry 115, 1074 (2015).
15 K. Hansen, F. Biegler, R. Ramakrishnan, W. Pronobis, O. A. von Lilienfeld, K.-R. Müller, and A. Tkatchenko, The Journal of Physical Chemistry Letters 6, 2326 (2015).
16 J. Cui and R. V. Krems, Journal of Physics B: Atomic, Molecular and Optical Physics 49, 224001 (2016).
17 A. Kamath, R. A. Vargas-Hernández, R. V. Krems, T. C. Jr., and S. Manzhos, Journal of Chemical Physics 148, 241702 (2018).
18 T. D. Huan, R. Batra, J. Chapman, S. Krishnan, L. Chen, and R. Ramprasad, npj Computational Materials 3, 37 (2017).
19 A. A. Peterson, R. Christensen, and A. Khorshidi, Physical Chemistry Chemical Physics 19, 10978 (2017).
20 L. Bartok-Pártay, A. Bartók, and G. Csányi, 114, 10502 (2010).
21 T. Q. Yu, P. Y. Chen, M. Chen, A. Samanta, E. Vanden-Eijnden, and M. Tuckerman, Journal of Chemical Physics 140, 214109 (2014).
22 A. Samanta, M. Chen, T. Q. Yu, M. Tuckerman, and W. E, Journal of Chemical Physics 140, 164109 (2014).
23 A. Samanta, M. A. Morales, and E. Schwegler, Journal of Chemical Physics 144, 164101 (2016).
24 L. Koziol, L. E. Fried, and N. Goldman, Journal of Chemical Theory and Computation 13, 135 (2017).
25 R. K. Lindsey, L. E. Fried, and N. Goldman, Journal of Chemical Theory and Computation 13, 6222 (2017).
26 L. Zhu, M. Amsler, T. Fuhrer, B. Schaefer, S. Faraji, S. Rostami, S. A. Ghasemi, A. Sadeghi, M. Grauzinyte, C. Wolverton, et al., Journal of Chemical Physics 144, 034203 (2016).
27 A. V. Shapeev, Multiscale Modeling and Simulation 14, 1153 (2016).
28 J. Behler, Physical Chemistry Chemical Physics 13, 17930 (2011).
29 J. Behler, The Journal of Chemical Physics 134, 074106 (2011).
30 J. Behler, Journal of Physics: Condensed Matter 26, 183001 (2014).
31 M. Gastegger, L. Schwiedrzik, M. Bittermann, F. Berzsenyi, and P. Marquetand, Journal of Chemical Physics 148, 241709 (2018).
32 A. Takahashi, A. Seko, and I. Tanaka, Journal of Chemical Physics 148, 234106 (2018).
33 L. Zhang, J. Han, H. Wang, R. Car, and W. E, Physical Review Letters 120, 143001 (2018).
34 R. Kondor and J. D. Lafferty, Proceedings of the International Conference on Machine Learning p. 315 (2002).
35 M. Belkin and P. Niyogi, Neural Computation 15, 1373 (2003).
36 R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker, Proceedings of the National Academy of Sciences of the United States of America 102, 7426 (2005).
37 A. E. Brouwer and W. H. Haemers, Spectra of graphs (Springer, New York, NY, 2012), ISBN 978-1-4614-1938-9.
38 J. Shi and J. Malik, IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 888 (2000).
39 A. Ng, M. Jordan, and Y. Weiss, Advances in Neural Information Processing Systems 14, 849 (2002).
40 B. Nadler, S. Lafon, R. Coifman, and I. Kevrekidis, Advances in Neural Information Processing Systems 18, 955 (2006).
41 S. V. N. Vishwanathan, N. N. Schraudolph, R. Kondor, and K. M. Borgwardt, Journal of Machine Learning Research 11, 1201 (2010).
42 N. Shervashidze, P. Schweitzer, E. J. van Leeuwen, K. Mehlhorn, and K. M. Borgwardt, Journal of Machine Learning Research 12, 2539 (2011).
43 G. Ferré, T. Haut, and K. Barros, Journal of Chemical Physics 146, 114107 (2017).
44 C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (2006).
45 D. Duvenaud, H. Nickisch, and C. E. Rasmussen, Advances in Neural Information Processing Systems 24, 226 (2011).
46 A. P. Bartók, R. Kondor, and G. Csányi, Physical Review B 96, 019902 (2017).
47 J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, SIAM Journal on Optimization 9, 112 (1998).
48 M. Randić, Journal of Chemical Information and Computer Sciences 15, 105 (1975).
49 N. Artrith and J. Behler, Physical Review B 85, 045439 (2012).