Chapter 7
Interpretability in Graph Neural Networks

Ninghao Liu, Department of CSE, Texas A&M University, e-mail: nhliu43@tamu.edu
Qizhang Feng, Department of CSE, Texas A&M University, e-mail: qf31@tamu.edu
Xia Hu, Department of CSE, Texas A&M University, e-mail: xiahu@tamu.edu
Deep learning has become an indispensable tool for a wide range of applications such as image processing, natural language processing, and speech recognition. Despite this success, deep models have been criticized as "black boxes" due to the complexity of how they process information and make decisions. In this section, we introduce the research background of interpretability in deep models, including the motivations for studying interpretability and commonly used interpretation methods.
Fig. 7.1: Left: Interpretation can improve the user experience of interacting with models. Right: Through interpretation, we can identify model behaviors that are undesirable to humans and improve the model accordingly (Ribeiro et al, 2016).
There are several pragmatic reasons that motivate people to study and improve
model interpretability. Depending on who finally benefits from interpretation, we
divide the reasons into model-oriented and user-oriented, as shown in Fig. 7.1.
2. Fairness: Interpretation helps examine whether model decisions are based on sensitive features that are required to be avoided in real applications.
3. Adversarial-Attack Robustness: Adversarial attack refers to adding carefully-
crafted perturbations to input, where the perturbations are almost imperceptible
to humans, but can cause the model to make wrong predictions (Goodfellow
et al, 2015). Robustness against adversarial attacks is an increasingly impor-
tant topic in machine learning security. Recent studies have shown how inter-
pretation could help in discovering new attack schemes and designing defense
strategies (Liu et al, 2020d).
4. Backdoor-Attack Robustness: Backdoor attack refers to injecting malicious
functionality into a model, by either implanting additional modules or poison-
ing training data. The model will behave normally unless it is fed with input
containing patterns that trigger the malicious functionality. Studying model ro-
bustness against backdoor attacks is attracting increasing interest. Recent research shows that interpretation can be applied to identify whether a model has been infected by backdoors (Huang et al, 2019c; Tang et al, 2020a).
(Figure 7.2 graphic: panels (d) and (e) depict a feature-space view around an instance $x_0$ under functions $f$ and $f'$, together with factors such as object color, object shape, background color, and ground color.)
Post-hoc interpretation has received a lot of interest in both research and real-world applications. Flexibility is one of its advantages, as it puts fewer requirements on the model types or structures. In the following paragraphs, we
briefly introduce several commonly used methods. The illustration of the basic idea
behind each of these methods is shown in Fig. 7.2.
The first type of methods to be introduced is approximation-based methods. Given a function $f$ that is complex to understand and an input instance $x^* \in \mathbb{R}^m$, we could approximate $f$ with a simple and understandable surrogate function $h$ (usually chosen as a linear function) locally around $x^*$. Here $m$ is the number of features in each instance. There are several ways to build $h$. A straightforward way is based on the first-order Taylor expansion, where:

$$h(x) = f(x^*) + w^{\top}(x - x^*), \qquad (7.1)$$

where $w \in \mathbb{R}^m$ tells how sensitive the output is to the input features. Typically, $w$ can be estimated with the gradient (Simonyan et al, 2013), so that $w = \nabla_x f(x^*)$.
When gradient information is not available, such as in tree-based models, we could build $h$ through training (Ribeiro et al, 2016). The general idea is that a number of training instances $(x_i, f(x_i))$, $1 \le i \le n$, are sampled around $x^*$, i.e., $\|x_i - x^*\| \le \epsilon$. The instances are then used to train $h$, so that $h$ approximates $f$ around $x^*$.
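As an illustrative sketch of this black-box variant (not the implementation used in (Ribeiro et al, 2016)), the following Python snippet samples instances around $x^*$, queries a black-box scoring function, and fits a proximity-weighted linear surrogate whose coefficients play the role of $w$; all function names and hyper-parameter choices here are hypothetical.

```python
import numpy as np

def fit_local_surrogate(f, x_star, eps=0.1, n_samples=500, seed=0):
    """Fit a local linear surrogate h(x) ~ f(x) around x_star.

    f      : black-box function mapping R^m -> R (e.g., a class score)
    x_star : the instance to be explained, shape (m,)
    eps    : radius of the sampling neighborhood
    Returns (w, b), where w is interpreted as feature importance.
    """
    rng = np.random.default_rng(seed)
    m = x_star.shape[0]
    # Sample instances x_i roughly within ||x_i - x_star|| <= eps.
    X = x_star + rng.uniform(-eps, eps, size=(n_samples, m))
    y = np.array([f(x) for x in X])
    # Weight samples by proximity to x_star (closer samples matter more).
    dist = np.linalg.norm(X - x_star, axis=1)
    sw = np.exp(-(dist ** 2) / (2 * eps ** 2))
    # Weighted least squares for h(x) = w^T x + b.
    design = np.hstack([X, np.ones((n_samples, 1))])
    Wmat = np.diag(sw)
    coef, *_ = np.linalg.lstsq(design.T @ Wmat @ design,
                               design.T @ Wmat @ y, rcond=None)
    return coef[:-1], coef[-1]

# Example: explain a toy nonlinear model around a point.
f = lambda x: np.tanh(3 * x[0]) + 0.5 * x[1] ** 2
w, b = fit_local_surrogate(f, np.array([0.2, -1.0]))
```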
Besides directly studying the sensitivity between input and output, there is an-
other type of method called layer-wise relevance propagation (LRP) (Bach et al,
2015). Specifically, LRP redistributes the activation score of the output neuron to its predecessor neurons, and this redistribution iterates until reaching the input neurons. The redistribution of scores is based on the connection weights between neurons in adjacent layers. The share received by each input neuron is used as its contribution to the output.
Another way to understand the importance of a feature xi is to answer questions
like “What would have happened to f , had xi not existed in input?”. If xi is important
for predicting f (x), then removing/weakening it will cause a significant drop in
prediction confidence. This type of method is called the perturbation method (Fong
and Vedaldi, 2017). One of the key challenges in designing perturbation methods is
how to guarantee the input after perturbation is still valid. For example, it is argued
that perturbation on word embedding vectors cannot explain deep language models,
because texts are discrete symbols, and it is hard to identify the meaning of perturbed
embeddings.
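A minimal sketch of the perturbation idea for tabular-style features is given below; the baseline value used to "remove" a feature is an assumption, and ensuring the perturbed input remains valid is exactly the challenge discussed above.

```python
import numpy as np

def occlusion_importance(f, x, baseline=0.0):
    """Perturbation-based importance: drop in the model score when each
    feature is replaced by a (hopefully valid) baseline value.

    f : black-box scoring function, R^m -> R
    x : input instance, shape (m,)
    """
    base_score = f(x)
    scores = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        x_pert = x.copy()
        x_pert[i] = baseline                 # "remove" feature i
        scores[i] = base_score - f(x_pert)   # large drop => important feature
    return scores
```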
Different from the previous methods that focus on explaining prediction results,
there is another type of method that tries to understand how data is represented in-
side a model. We call it representation interpretation. There is no unified definition
for representation interpretation. The design of methods under this category is usu-
ally motivated by the nature of the problem or the properties of data. For example,
in natural language processing, it has been shown that a word embedding could be
understood as the composition of a number of basis word embeddings, where the
basis words constitute a dictionary (Mathew et al, 2020).
Besides understanding predictions and data representations, another interpreta-
tion scheme is to understand the role of model components. A well-known example
is to visualize the visual patterns that maximally activate the target neuron/layer in
a CNN model (Olah et al, 2018). In this way, we understand what kind of visual
signal is detected by the target component. The interpretation is usually obtained
through a generative process, so that the result is understandable to humans.
Another interpretation strategy is to train an interpretable model to mimic the outputs of the complex model. The pool of interpretable models includes linear models, decision trees, rule-based models, etc. This strategy is also called mimic learning. An interpretable model trained in this way tends to perform better than the same model trained directly on the data, and it is also much easier to understand than the complex model.
Attention models, originally introduced for machine translation tasks, have now
become enormously popular, partially due to their interpretation properties. The in-
tuition behind attention models can be explained using human biological systems,
where we tend to selectively focus on some parts of the input, while ignoring other
irrelevant parts (Xu et al, 2015). By examining attention scores, we could know
which features in the input have been used for making the prediction. This is also
similar to using post-hoc interpretation algorithms that find which input features are
important. The major difference is that attention scores are generated during model
prediction, while post-hoc interpretation is performed after prediction.
Deep models heavily rely on learning effective representations to compress in-
formation for downstream tasks. However, it is hard for humans to understand the
representations as the meanings of different dimensions are unknown. To tackle this
challenge, disentangled representation learning has been proposed. Disentangled
representation learning breaks down features of different meanings and encodes
them as separate dimensions in representations. As a result, we could check each
dimension to understand which factors of input data are encoded. For example, af-
ter learning disentangled representations on 3D-chair images, factors such as chair
leg style, width and azimuth, are separately encoded into different dimensions (Hig-
gins et al, 2017).
Despite the major progress made in domains such as vision, language and control,
many defining characteristics of human intelligence remain out of reach for tradi-
tional deep models such as convolutional neural networks (CNNs), recurrent neural
networks (RNNs) and multi-layer perceptrons (MLPs). In the search for new model architectures, it is believed that GNN architectures could lay the foundation for more
interpretable patterns of reasoning (Battaglia et al, 2018). In this part, we discuss the
advantages of GNNs and challenges to be tackled in terms of interpretability.
The GNN architecture is regarded as more interpretable because it facilitates
learning about entities, relations, and rules for composing them. First, entities are discrete and usually represent high-level concepts or knowledge items, so they are regarded as easier for humans to understand than image pixels (tiny granularity) or
word embeddings (latent space vectors). Second, GNN inference propagates infor-
mation through links, so it is easier to find the explicit reasoning path or subgraph
that contributes to the prediction result. Therefore, there is a recent trend of trans-
forming images or text data into graphs, and then applying GNN models for predic-
tions. For example, to build a graph from an image, we can treat objects inside the
image (or different portions within an object) as nodes, and generate links based on
the spatial relations between nodes. Similarly, a document can be transformed into a
graph by discovering concepts (e.g., nouns, named entities) as nodes and extracting
their relations as links through lexical parsing.
Although the graph data format lays a foundation for interpretable modeling,
there are still several challenges that undermine GNN interpretability. First, GNN
still maps nodes and links into embeddings. Therefore, similar to traditional deep
models, GNN also suffers from the opacity of information processing in intermedi-
ate layers. Second, different information propagation paths or subgraphs contribute
differently to the final prediction. GNN does not directly provide the most impor-
tant reasoning paths for its prediction, so post-hoc interpretation methods are still
needed. In the following sections, we will introduce the recent advances in tackling
the above challenges to improve the explainability and interpretability of GNNs.
7.2.1 Background
Before introducing the techniques, we first provide the definition of graphs and re-
view the fundamental formulations of a GNN model.
Graphs: In the rest of the chapter, if not specified, the graphs we discuss are limited to homogeneous graphs.

Definition 7.3. A homogeneous graph is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges between nodes.

Furthermore, let $A \in \mathbb{R}^{n \times n}$ be the adjacency matrix of $\mathcal{G}$, where $n = |\mathcal{V}|$. For unweighted graphs, $A_{i,j}$ is binary, where $A_{i,j} = 1$ means there exists an edge $(i, j) \in \mathcal{E}$, and otherwise $A_{i,j} = 0$. For weighted graphs, each edge $(i, j)$ is assigned a weight $w_{i,j}$, so $A_{i,j} = w_{i,j}$. In some cases, nodes are associated with features, which could be denoted as $X \in \mathbb{R}^{n \times m}$, where $X_{i,:}$ is the feature vector of node $i$. The number of features for each node is $m$. In this chapter, unless otherwise stated, we focus on GNN models on homogeneous graphs.
GNN Fundamentals: Traditional GNNs propagate information via the input graph's structure according to the propagation scheme:

$$H^{l+1} = \sigma\big(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\big), \qquad (7.2)$$
Fig. 7.3: Illustration of explanation result formats. Explanation results for graph
neural networks could be the important nodes, the important edges, the important
features, etc. An explanation method may return multiple types of results.
where $H^{l}$ denotes the embedding matrix at layer $l$, and $W^{l}$ denotes the trainable parameters at layer $l$. Also, $\tilde{A} = A + I$ denotes the adjacency matrix of the graph after adding self-loops. The matrix $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$, i.e., $\tilde{D}_{i,i} = \sum_{j} \tilde{A}_{i,j}$. Therefore, $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ normalizes the adjacency matrix. If we only focus on the embedding update of node $i$, the GCN propagation scheme could be rewritten as:

$$H^{l+1}_{i,:} = \sigma\Big(\sum_{j \in \mathcal{V}_i \cup \{i\}} \frac{1}{c_{i,j}}\, H^{l}_{j,:} W^{l}\Big), \qquad (7.3)$$

where $H_{j,:}$ denotes the $j$-th row of matrix $H$, and $\mathcal{V}_i$ denotes the neighbors of node $i$. Here $c_{i,j}$ is a normalization constant, and $\frac{1}{c_{i,j}} = \big(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}\big)_{i,j}$. Therefore, the embedding of node $i$ at layer $l+1$ can be seen as aggregating the embeddings of node $i$'s neighbors, followed by a transformation. The embeddings at the first layer $H^{0}$ are usually set as the node features. As the layers go deeper, the computation of each node's embedding involves nodes that are farther away. For example, in a 2-layer GNN, computing the embedding of node $i$ uses the information of nodes within the 2-hop neighborhood of node $i$. The subgraph composed of these nodes is called the computation graph of node $i$, as shown in Fig. 7.3.
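The propagation scheme in Equations 7.2-7.3 can be sketched in a few lines of Python with NumPy; the ReLU non-linearity, random weights, and toy graph below are illustrative choices rather than part of any specific GNN library.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H^{l+1} = sigma(D~^{-1/2} A~ D~^{-1/2} H^l W^l).

    A : (n, n) adjacency matrix of the input graph
    H : (n, d_in) node embeddings at layer l (H^0 = node features X)
    W : (d_in, d_out) trainable weight matrix of layer l
    """
    n = A.shape[0]
    A_tilde = A + np.eye(n)                    # add self-loops
    d = A_tilde.sum(axis=1)                    # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt  # normalized adjacency
    return np.maximum(A_hat @ H @ W, 0.0)      # ReLU non-linearity (illustrative)

# A 2-layer GNN: node i's embedding depends on its 2-hop computation graph.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)
H1 = gcn_layer(A, X, np.random.randn(4, 8))
H2 = gcn_layer(A, H1, np.random.randn(8, 2))
```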
Target Models: There are two common tasks in graph analysis, i.e., graph-level prediction and node-level prediction. We use classification tasks as the example. In graph-level tasks, the model $f(\mathcal{G}) \in \mathbb{R}^{C}$ produces a single prediction for the whole graph, where $C$ is the number of classes. The prediction score for class $c$ could be written as $f^{c}(\mathcal{G})$. In node-level tasks, the model $f(\mathcal{G}) \in \mathbb{R}^{n \times C}$ returns a matrix, where each row is the prediction for a node. Some explanation methods are designed solely for graph-level tasks, some are for node-level tasks, while others could handle both scenarios. The computation graphs introduced above are commonly used in explaining node-level predictions.
(Figure graphic: plots contrasting gradient-based explanation scores $S(x)$ for a class output $f^{c}$, with panels for the raw gradient (SA), Grad$\odot$Input, SmoothGrad, and Integrated Gradients (IG).)
The approximation-based explanation has been widely used to analyze the predic-
tion of models with complex structures. Approximation-based approaches could be
further divided into white-box approximation and black-box approximation. The
white-box approximation uses information inside the model, which includes but is
not limited to gradients, intermediate features, model parameters, etc. The black-box
approximation does not utilize information propagation inside the model. It usually
uses a simple and interpretable model to fit the target model’s decision on an input
instance. Then, the explanation can be easily extracted from the simple model. The details of commonly used methods in both categories are introduced below.
Sensitivity Analysis (SA) (Baldassarre and Azizpour, 2019) studies the impact of a particular change in an independent variable on a dependent variable. In the context of explanation, the dependent variable refers to the prediction, while the independent variables refer to the features. The local gradient of the model is commonly used as the sensitivity score to represent the correlation between a feature and the prediction result. The sensitivity score is defined as:

$$S(x) = \big\|\nabla_x f(\mathcal{G})\big\|^{2}. \qquad (7.4)$$

Grad$\odot$Input further multiplies the gradient with the input feature values:

$$S(x) = \nabla_x f(\mathcal{G})^{\top} x. \qquad (7.5)$$
Therefore, Grad$\odot$Input considers not only the feature sensitivity, but also the scale of the feature values. However, the methods mentioned above all suffer from the saturation problem, where the scope of the local gradients is too limited to reflect the overall contribution of each feature.
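Assuming a differentiable GNN exposed as a Python callable model(X, A) that returns a vector of class scores (a hypothetical interface, not a specific library API), the raw-gradient and Grad$\odot$Input scores can be sketched with PyTorch autograd as follows.

```python
import torch

def gradient_scores(model, X, A, class_idx):
    """Raw-gradient (SA) and Grad-times-Input scores for node features.

    model     : differentiable GNN, model(X, A) -> class scores (assumed API)
    X         : (n, m) node feature matrix (torch.Tensor)
    A         : (n, n) adjacency matrix (torch.Tensor)
    class_idx : index c of the class whose score f^c is explained
    """
    X = X.clone().detach().requires_grad_(True)
    score = model(X, A)[class_idx]     # f^c(G) for a graph-level prediction
    score.backward()
    grad = X.grad                      # sensitivity of f^c w.r.t. each feature
    sa = grad.pow(2)                   # one common sensitivity score: squared gradient
    grad_times_input = grad * X        # Grad (.) Input: sensitivity times feature value
    return sa.detach(), grad_times_input.detach()
```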
Integrated Gradients (IG) (Sanchez-Lengeling et al, 2020) solves the saturation problem by aggregating feature contributions along a designed path in input space. This path starts from a chosen baseline point $\mathcal{G}'$ and ends at the target input $\mathcal{G}$. Specifically, the feature contribution is computed as:

$$S(x) = (x - x') \odot \int_{\alpha=0}^{1} \nabla_x f\big(\mathcal{G}' + \alpha\,(\mathcal{G} - \mathcal{G}')\big)\, d\alpha, \qquad (7.6)$$

where $x'$ denotes a feature vector in the baseline point $\mathcal{G}'$, while $x$ is a feature vector in the original input $\mathcal{G}$. The choice of baseline $\mathcal{G}'$ is relatively flexible. A typical strategy is to use a null graph as the baseline, which has the same topology but whose nodes use "unspecified" categorical features. This choice is motivated by applications where node features are categorical.
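Under the same hypothetical model(X, A) interface, Equation 7.6 can be approximated with a Riemann sum over interpolation steps between the baseline features and the input features; the number of steps below is an arbitrary choice.

```python
import torch

def integrated_gradients(model, X, A, X_baseline, class_idx, steps=50):
    """Integrated Gradients over node features, approximating Eq. (7.6).

    model      : differentiable GNN, model(X, A) -> class scores (assumed API)
    X          : (n, m) node features of the input graph G
    X_baseline : (n, m) node features of the baseline graph G' (same topology)
    """
    total_grad = torch.zeros_like(X)
    for step in range(1, steps + 1):
        alpha = step / steps
        X_interp = (X_baseline + alpha * (X - X_baseline)).detach().requires_grad_(True)
        score = model(X_interp, A)[class_idx]
        score.backward()
        total_grad += X_interp.grad
    # (x - x') element-wise times the gradient averaged along the path.
    return ((X - X_baseline) * total_grad / steps).detach()
```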
CAM (Class Activation Mapping) (Pope et al, 2019) applies to GNNs whose final node embeddings are aggregated by a global average pooling (GAP) layer followed by a fully-connected output layer. CAM treats each dimension of the final node embeddings (i.e., $H^{L}_{:,k}$) as a feature map. Let $h$ denote the graph representation after the GAP layer. The prediction for class $c$ is then

$$f^{c}(\mathcal{G}) = \sum_{k} w^{c}_{k}\, h_{k}, \qquad (7.7)$$

where $h_k$ denotes the $k$-th entry of $h$, and $w^{c}_{k}$ is the GAP-layer weight of the $k$-th feature map with respect to class $c$. Therefore, the contribution of node $i$ to the prediction is:

$$S(i) = \frac{1}{n} \sum_{k} w^{c}_{k}\, H^{L}_{i,k}. \qquad (7.8)$$
Although CAM is simple and efficient, it only works on models with certain struc-
tures, which greatly limits its application scenarios.
Grad-CAM (Pope et al, 2019) combines gradient information with feature maps
to relax the limitation of CAM. While CAM uses the GAP layer to estimate the
weight of each feature map, Grad-CAM employs the gradient of output with respect
to the feature maps to compute the weights, so that:
$$w^{c}_{k} = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial f^{c}(\mathcal{G})}{\partial H^{L}_{i,k}}, \qquad (7.9)$$

$$S(i) = \mathrm{ReLU}\Big(\sum_{k} w^{c}_{k}\, H^{L}_{i,k}\Big). \qquad (7.10)$$
The ReLU function forces the explanation to focus on the positive influence on the
class of interest. Grad-CAM is equivalent to CAM for GNNs with only one fully-
connected layer before output. Compared to CAM, Grad-CAM can be applied to
more GNN architectures, thus avoiding the trade-off between model explainability
and capacity.
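A minimal NumPy sketch of Equations 7.7-7.10 is shown below; it assumes the final node embeddings $H^{L}$ (and, for Grad-CAM, the gradient of $f^{c}$ with respect to $H^{L}$) have already been extracted from the model.

```python
import numpy as np

def cam_node_importance(H_L, w_c):
    """CAM (Eq. 7.8): node contributions under a GAP + linear output head.

    H_L : (n, K) final-layer node embeddings (each column is a feature map)
    w_c : (K,) output-layer weights for the class of interest
    """
    n = H_L.shape[0]
    return (H_L @ w_c) / n                # S(i) = (1/n) * sum_k w^c_k H^L_{i,k}

def grad_cam_node_importance(H_L, grad_H_L):
    """Grad-CAM (Eqs. 7.9-7.10): feature-map weights come from gradients
    of the class score w.r.t. H^L instead of the GAP-layer weights.

    grad_H_L : (n, K) gradient of f^c(G) with respect to H^L
    """
    w_c = grad_H_L.mean(axis=0)           # Eq. (7.9): average over nodes
    return np.maximum(H_L @ w_c, 0.0)     # Eq. (7.10): ReLU of the weighted sum
```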
A related strategy learns a soft mask over the edges of the computation graph $\mathcal{G}_t$ of the target prediction, where each mask entry is in $[0, 1]$. There are two loss terms for training the mask: (1) $f'(\mathcal{G}_t \odot M)$ should be close to $f'(\mathcal{G}_t)$, and (2) the mask $M$ should be sparse. The resulting mask entries indicate the importance scores of the edges in $\mathcal{G}_t$, where a higher mask value means the corresponding edge is more important.
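The soft-mask idea can be sketched as a small optimization loop; the KL-divergence fidelity term, the sparsity weight, and the model(X, A) interface below are illustrative assumptions rather than the exact losses used in the literature.

```python
import torch

def learn_edge_mask(model, A, X, epochs=200, lam=0.005, lr=0.01):
    """Learn a soft edge mask M so that the prediction on the masked graph stays
    close to the original one while M is sparse (a sketch of the idea above).

    model : GNN taking (X, A) and returning strictly positive class
            probabilities (assumed API)
    """
    mask_logits = torch.zeros_like(A, requires_grad=True)
    target = model(X, A).detach()
    optimizer = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(epochs):
        M = torch.sigmoid(mask_logits)           # soft mask, entries in [0, 1]
        pred = model(X, A * M)                   # prediction on the masked graph
        fidelity_loss = torch.nn.functional.kl_div(
            pred.log(), target, reduction="sum") # keep the prediction unchanged
        sparsity_loss = M.mean()                 # encourage a sparse mask
        loss = fidelity_loss + lam * sparsity_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return torch.sigmoid(mask_logits).detach()   # edge importance scores
```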
PGM-Explainer applies probabilistic graphical models to explain GNNs. To
find the neighbor instances of the target, PGM-Explainer first randomly selects
nodes to be perturbed from computation graphs. Then, the selected nodes’ features
are set to the mean value among all nodes. After that, PGM-Explainer employs a
pair-wise dependence test to filter out unimportant samples, aiming at reducing the
computational complexity. Finally, a Bayesian network is introduced to fit the pre-
dictions of chosen samples. Therefore, the advantage of PGM-Explainer is that it
illustrates the dependency between features.
Relevance propagation redistributes the activation score of the output neuron to its predecessor neurons, iterating until reaching the input neurons. The core of relevance propagation methods is to define a rule for redistributing activation between neurons. Relevance propagation has been widely used to explain models in domains such as computer vision and natural language processing. Recently, some work has explored the possibility of revising relevance propagation methods for GNNs. Representative approaches include LRP (Layer-wise Relevance Propagation) (Baldassarre and Azizpour, 2019; Schwarzenberg et al, 2019), GNN-LRP (Schnake et al, 2020), and ExcitationBP (Pope et al, 2019).
LRP was first proposed in (Bach et al, 2015) to calculate the contribution of individual pixels to the prediction result of an image classifier. The core idea of LRP is to use backpropagation to recursively propagate the relevance scores of high-level neurons to low-level neurons, down to the input-level feature neurons. The relevance score of the output neuron is set as the prediction score. The relevance score that a neuron receives is proportional to its activation value, which follows the intuition that neurons with higher activation tend to contribute more to the prediction. In (Baldassarre and Azizpour, 2019; Schwarzenberg et al, 2019), the propagation rule is defined as below:
$$R^{l}_{i} = \sum_{j} \frac{z^{+}_{i,j}}{\sum_{k} z^{+}_{k,j} + b^{+}_{j} + \epsilon}\, R^{l+1}_{j}, \qquad z_{i,j} = x^{l}_{i} w_{i,j}, \qquad (7.12)$$

where $R^{l}_{i}$ is the relevance score of neuron $i$ at layer $l$, $z^{+}_{i,j}$ keeps only the positive part of $z_{i,j}$, and $\epsilon$ is a small stabilizer. When applied to GNNs, the explanation results of LRP are limited to nodes and node features, and graph edges are excluded. The reason is that the adjacency matrix is treated as part of the GNN model. Therefore, LRP is unable to analyze topological information, which nevertheless plays an important role in graph data.
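For a single fully-connected layer, the rule in Equation 7.12 can be sketched as follows; the variable names are illustrative, and applying it to a GNN requires unrolling the propagation layers accordingly.

```python
import numpy as np

def lrp_epsilon_layer(x, W, b, R_next, eps=1e-6):
    """One step of the rule in Eq. (7.12): redistribute the relevance of layer
    l+1 neurons back to layer l neurons through positive contributions.

    x      : (d_in,) activations of layer l
    W      : (d_in, d_out) weights connecting layer l to layer l+1
    b      : (d_out,) biases of layer l+1
    R_next : (d_out,) relevance scores of layer l+1 neurons
    """
    z = x[:, None] * W                  # z_{i,j} = x_i * w_{i,j}
    z_pos = np.maximum(z, 0.0)          # keep positive contributions only
    denom = z_pos.sum(axis=0) + np.maximum(b, 0.0) + eps
    # Share of neuron j's relevance that flows back to neuron i.
    return (z_pos / denom) @ R_next     # R^l_i = sum_j z^+_{i,j}/denom_j * R^{l+1}_j
```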
ExcitationBP is a top-down attention model originally developed for CNNs
(Zhang et al, 2018d). It shares a similar idea as LRP. However, ExcitationBP defines
the relevance score as a probability distribution and uses a conditional probability
model to describe the relevance propagation rule.
$$P(a_j) = \sum_{i} P(a_j \mid a_i)\, P(a_i), \qquad (7.13)$$
where a j is the j-th neuron in the lower layer and ai is the i-th parent neuron of
a j in the higher layer. When the propagation process passes through the activation
function, only non-negative weights are considered and negative weights are set to
zero. To extend ExcitationBP for graph data, new backward propagation schemes
are designed for the softmax classifier, the GAP (global average pooling) layer and
the graph convolutional operator.
GNN-LRP mitigates the weakness of traditional LRP by defining a new prop-
agation rule. Instead of using the adjacency matrix to obtain propagation paths,
GNN-LRP assigns the relevance score to a walk, which refers to a message flow
path in the graph. The relevance score is defined by the T -order Taylor expansion of
the model with respect to the incorporation operator (graph convolutional operator,
linear message function, etc.). The intuition is that the incorporation operator with
greater gradients has a greater influence on the final decision.
GNNExplainer (Ying et al, 2019) identifies a compact subgraph of the input that is most influential to the prediction, by maximizing the mutual information between the original prediction and the prediction made on the subgraph:

$$\max_{\mathcal{G}_S} MI\big(Y, (\mathcal{G}_S, X_S)\big) = H(Y) - H\big(Y \mid \mathcal{G} = \mathcal{G}_S,\, X = X_S\big), \qquad (7.14)$$

where $\mathcal{G}_S$ and $X_S$ are the subgraph and its node features. $Y$ is the predicted label distribution, and its entropy $H(Y)$ is a constant. To solve the optimization problem above, the authors apply a soft mask $M$ on the adjacency matrix:

$$\min_{M} \; -\sum_{c=1}^{C} \mathbb{1}[y = c]\, \log P_{\Phi}\big(Y = y \mid \mathcal{G} = A_c \odot \sigma(M),\, X = X_c\big), \qquad (7.15)$$

where $\sigma(\cdot)$ is the sigmoid function that maps the mask entries to $[0,1]$.

PGExplainer (Luo et al, 2020) also learns edge importance, but with a parameterized mask generator, so that explanations can be produced for new instances without re-running the optimization. The mask value of edge $(i, j)$ is predicted from node embeddings:

$$M_{i,j} = \mathrm{MLP}_{\Psi}\big([z_i;\, z_j]\big), \qquad (7.16)$$

where $\Psi$ denotes the trainable parameters of the MLP, $z_i$ and $z_j$ are the embedding vectors of node $i$ and node $j$, respectively, and $[\cdot\,;\cdot]$ denotes concatenation. Similar to GNNExplainer, the mask generator is trained by maximizing the mutual information between the original prediction and the new prediction.
GraphMask (Schlichtkrull et al, 2021) also produces the explanation by estimat-
ing the influences of edges. Similar to PGExplainer, GraphMask learns an erasure
function that quantifies the importance of each edge. The erasure function is defined
as:

$$z^{(k)}_{u,v} = g_{\pi}\big(h^{(k)}_u,\, h^{(k)}_v,\, m^{(k)}_{u,v}\big), \qquad (7.17)$$

where $h_u$, $h_v$ and $m_{u,v}$ refer to the hidden embedding vectors of node $u$, node $v$, and the message sent through the edge in graph convolution, respectively; $\pi$ denotes the parameters of function $g$. One difference between GraphMask and PGExplainer is that the former also takes the message (edge) embedding as input.
provides the importance estimation for every graph convolution layer, and k indi-
cates the layer that the embedding vectors belong to. Instead of directly erasing the
influences of unimportant edges, the authors then propose to replace the message
sent through unimportant edges as:
$$\tilde{m}^{(k)}_{u,v} = z^{(k)}_{u,v} \cdot m^{(k)}_{u,v} + \big(1 - z^{(k)}_{u,v}\big)\cdot b^{(k)}, \qquad (7.18)$$

where $b^{(k)}$ is trainable. The work shows that a large proportion of edges can be dropped without deteriorating the model performance.
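Equation 7.18 itself is a one-line gating operation; the sketch below only illustrates the replacement step and assumes the gate value, the message, and the baseline vector are given.

```python
import numpy as np

def gated_message(z_uv, m_uv, b_k):
    """Eq. (7.18): instead of deleting an unimportant edge, replace its
    message with a learned baseline vector b^(k).

    z_uv : scalar gate in [0, 1] produced by the erasure function g_pi
    m_uv : (d,) original message sent along edge (u, v) at layer k
    b_k  : (d,) trainable baseline message for layer k
    """
    return z_uv * m_uv + (1.0 - z_uv) * b_k

# Example: a gate close to 0 suppresses the edge's original message.
msg = gated_message(0.05, np.ones(4), np.zeros(4))
```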
Causal Screening (Wang et al, 2021) is a model-agnostic post-hoc method that identifies a subgraph of the input as the explanation from a cause-effect standpoint. Causal Screening uses the causal effect of a candidate subgraph as its screening metric:

$$S(\mathcal{G}_k) = MI\big(\hat{y},\, \mathrm{do}(\mathcal{G} = \mathcal{G}_k)\big) - MI\big(\hat{y},\, \mathrm{do}(\mathcal{G} = \varnothing)\big), \qquad (7.19)$$

where $\mathcal{G}_k$ is the candidate subgraph, $k$ is the number of edges, and $MI$ is the mutual information. The interventions $\mathrm{do}(\mathcal{G} = \mathcal{G}_k)$ and $\mathrm{do}(\mathcal{G} = \varnothing)$ mean that the model input receives treatment (feeding $\mathcal{G}_k$ into the model) or control (feeding $\varnothing$ into the model), respectively. $\hat{y}$ denotes the prediction when feeding the original graph into the model. Causal Screening uses a greedy algorithm to search for the explanation: starting from an empty set, at each step it adds the edge with the highest causal effect to the candidate subgraph.
CF-GNNExplainer (Lucic et al, 2021) also proposes to generate counterfactual explanations for GNNs. Different from previous methods that try to find a sparse subgraph that preserves the correct prediction, CF-GNNExplainer finds the minimal number of edges to remove such that the prediction changes. Similar to GNNExplainer, CF-GNNExplainer employs a soft mask as well. Therefore, it also suffers from the "introduced evidence" problem (Dabkowski and Gal, 2017), which means that mask values other than zero or one may introduce unnecessary information or noise, and thus influence the explanation result.
XGNN (Yuan et al, 2020) provides model-level explanations by training a graph generator with reinforcement learning: at each step, the generator encodes the current intermediate graph with a GNN, and then takes the learned features as input to predict the probability of a start point and an endpoint. The endpoint and the edge between the two points are added to update the intermediate graph as an action. Finally, it calculates the reward of the action, so
that we can train the generator via policy gradient algorithms. The reward consists
of two terms. The first term is the score of the intermediate graph after feeding it to
the target GNN model. The second one is a regularization term that guarantees the
validity of the intermediate graph. The above steps are executed repeatedly until the
number of action steps reaches the predefined upper limit. As a generative explana-
tion method, XGNN provides a holistic explanation for graph classification. There
could be more generative explanation methods for other graph analysis tasks to be
explored in the future.
Attention mechanisms have been incorporated into GNN architectures, with the Graph Attention Network (GAT) (Veličković et al, 2018) as a representative example. In GAT, the embedding of node $i$ is updated by attending over its neighborhood:

$$h^{l+1}_i = \sigma\Big(\sum_{j \in \mathcal{V}_i \cup \{i\}} \alpha_{i,j}\, W h^{l}_j\Big), \qquad (7.20)$$

where $\alpha_{i,j}$ is the attention score, and $\mathcal{V}_i$ denotes the set of neighbors of node $i$. Also, GAT uses a shared parameter matrix $W$ independent of the layer depth. The attention score is computed as:

$$\alpha_{i,j} = \mathrm{softmax}(e_{i,j}) = \frac{\exp(e_{i,j})}{\sum_{k \in \mathcal{V}_i \cup \{i\}} \exp(e_{i,k})}, \qquad (7.21)$$
Fig. 7.5: Left: An illustration of graph convolution with single head attentions by
node 1 on its neighborhood. Middle: The linear transformation with a shared param-
eter matrix. Right: The attention mechanism employed in (Veličković et al, 2018).
where $e_{i,j}$ measures the importance of node $j$'s features to node $i$. In (Veličković et al, 2018), it is computed as

$$e_{i,j} = \mathrm{LeakyReLU}\big(\mathbf{a}^{\top}\,[\,W h^{l}_i \,\Vert\, W h^{l}_j\,]\big), \qquad (7.22)$$

where $\Vert$ denotes vector concatenation. In general, the attention mechanism can also be denoted as $e_{i,j} = \mathrm{attn}(h^{l}_i, h^{l}_j)$. Therefore, the attention mechanism is a single-layer neural network parameterized by a weight vector $\mathbf{a}$. The attention score $\alpha_{i,j}$ indicates the importance of node $j$ to node $i$.
The above mechanism could also be extended with multi-head attention. Specif-
ically, K independent attention mechanisms are executed in parallel, and the results
are concatenated:
$$h^{l+1}_i = \big\Vert_{k=1}^{K}\; \sigma\Big(\sum_{j \in \mathcal{V}_i \cup \{i\}} \alpha^{k}_{i,j}\, W^{k} h^{l}_j\Big), \qquad (7.23)$$

where $\alpha^{k}_{i,j}$ is the normalized attention score from the $k$-th attention mechanism, and $W^{k}$ is the corresponding parameter matrix.
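A single-head version of Equations 7.20-7.22 can be sketched in NumPy as follows; the LeakyReLU slope, the tanh non-linearity, and the returned data structure are illustrative choices.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_attention(H, W, a, neighbors, i):
    """Single-head GAT attention around node i (cf. Eqs. 7.20-7.22).

    H         : (n, d) node embeddings h^l
    W         : (d, d_out) shared projection matrix
    a         : (2 * d_out,) attention parameter vector
    neighbors : iterable of neighbor indices V_i (node i itself is added below)
    Returns the attention scores over V_i ∪ {i} and node i's updated embedding.
    """
    idx = list(neighbors) + [i]
    Wh = H @ W
    # e_ij = LeakyReLU(a^T [W h_i || W h_j])
    e = np.array([leaky_relu(a @ np.concatenate([Wh[i], Wh[j]])) for j in idx])
    e = e - e.max()                        # numerical stability
    alpha = np.exp(e) / np.exp(e).sum()    # softmax over V_i ∪ {i} (Eq. 7.21)
    h_new = np.tanh(sum(a_j * Wh[j] for a_j, j in zip(alpha, idx)))  # Eq. 7.20
    return dict(zip(idx, alpha)), h_new
```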
Besides learning node embeddings, we could also apply attention mechanisms to
learn a low-dimensional embedding for the whole graph (Ling et al, 2021). Suppose
we are working on an information retrieval problem. Given a set of graphs $\{\mathcal{G}_m\}$, $1 \le m \le M$, and a query $q$, we want to return the graphs that are most relevant to the query. The embedding of each graph $\mathcal{G}_m$ with respect to $q$ could be computed using
the attention mechanism. In the first step, we could apply normal GNN propagation
rules as introduced in Equation 7.2, to obtain the embeddings of nodes inside each
graph. Let $q$ denote the embedding of the query, and $h_{i,m}$ denote the embedding of node $i$ in graph $\mathcal{G}_m$. The embedding of graph $\mathcal{G}_m$ with respect to the query can be computed as:

$$h^{q}_{\mathcal{G}_m} = \frac{1}{|\mathcal{G}_m|} \sum_{i=1}^{|\mathcal{G}_m|} \alpha_{i,q}\, h_{i,m}, \qquad (7.24)$$

where $\alpha_{i,q} = \mathrm{attn}(h_{i,m}, q)$ is the attention score, and $\mathrm{attn}(\cdot)$ is a certain attention function. Finally, $h^{q}_{\mathcal{G}_m}$ can be used to compute the similarity of $\mathcal{G}_m$ to the query in the graph retrieval task.
A heterogeneous network is a network with multiple types of nodes, links, and even
attributes. The structural heterogeneity and rich semantic information bring chal-
lenges for designing graph neural networks to fuse information.
Definition 7.4. A heterogeneous graph is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \phi, \psi)$, where $\mathcal{V}$ is the set of node objects and $\mathcal{E}$ is the set of edges. Each node $v \in \mathcal{V}$ is associated with a node type $\phi(v)$, and each edge $(i, j) \in \mathcal{E}$ is associated with an edge type $\psi((i, j))$.
We introduce how the challenge in embedding could be tackled using Heteroge-
neous graph Attention Network (HAN) (Wang et al, 2019m). Different from tradi-
tional GNNs, information propagation on HAN is conducted based on meta-paths.
Definition 7.5. A meta-path $\Phi$ is defined as a path of the form $v_{i_1} \xrightarrow{r_1} v_{i_2} \xrightarrow{r_2} \cdots \xrightarrow{r_{l-1}} v_{i_l}$, abbreviated as $v_{i_1} v_{i_2} \cdots v_{i_l}$, with a composite relation $r_1 \circ r_2 \circ \cdots \circ r_{l-1}$.
To learn the embedding of node $i$, we propagate the embeddings from its neighbors within the meta-path. The set of neighbor nodes is denoted as $\mathcal{V}^{\Phi}_i$. Considering that different types of nodes have different feature spaces, a node embedding is first projected to the same space: $h'_j = M_{\phi_i} h_j$, where $M_{\phi_i}$ is the transformation matrix for node type $\phi_i$. The attention mechanism in HAN is similar to GAT, except that we need to consider the type of meta-path that is currently sampled. Specifically,

$$z_{i,\Phi} = \sigma\Big(\sum_{j \in \mathcal{V}^{\Phi}_i} \alpha^{\Phi}_{i,j}\, h'_j\Big), \qquad (7.25)$$

where $\alpha^{\Phi}_{i,j}$ is the node-level attention score of neighbor $j$ under meta-path $\Phi$.
Given a set of meta-paths $\{\Phi_1, \ldots, \Phi_P\}$, we can obtain a group of node embeddings denoted as $\{z_{i,\Phi_1}, \ldots, z_{i,\Phi_P}\}$. To fuse the embeddings across different meta-paths, another attention mechanism is applied at the meta-path level, producing a weight $\beta_{\Phi_p}$ for each meta-path. The fused embedding is computed as:

$$z_i = \sum_{p=1}^{P} \beta_{\Phi_p}\, z_{i,\Phi_p}. \qquad (7.27)$$
Fig. 7.6: Using multiple embeddings to represent the interests of a user. Each em-
bedding segment corresponds to one aspect in data (Liu et al, 2019a).
Fig. 7.7: The high-level idea of learning the disentangled node embedding for a
target node by using clustering or dynamic routing.
where $\tau$ is a hyper-parameter that scales the cosine similarity. Then, the probability of observing an edge $(u, t)$ between user $u$ and item $t$ is

$$p(t \mid u, c_t) \propto \sum_{k=1}^{K} c_{t,k} \cdot \mathrm{similarity}(h_t, h_{u,k}). \qquad (7.30)$$
Besides the fundamental learning process introduced above, the variational autoen-
coder framework could also be applied to regularize the learning process (Ma et al,
2019c). The item embeddings and prototype embeddings are jointly updated until
convergence. The embedding of each user hu is determined by aggregating the em-
beddings of interacted items, where hu,k collects embeddings from items that also
belong to facet k. In the learning process, the cluster discovery, node-cluster assign-
ments, and embedding learning are jointly conducted.
The idea of using dynamic routing for disentangled node representation learning is
motivated by the Capsule Network (Sabour et al, 2017). There are two layers of
capsules, i.e., low-level capsules and high-level capsules. Given a user u, the set of
items that he has interacted with is denoted as $\mathcal{V}_u$. The set of low-level capsules is $\{c^{l}_i\}$, $i \in \mathcal{V}_u$, so each capsule is the embedding of an interacted item. The set of high-level capsules is $\{c^{h}_k\}$, $1 \le k \le K$, where $c^{h}_k$ represents the user's $k$-th interest.
The routing logit value $b_{i,k}$ between low-level capsule $i$ and high-level capsule $k$ is computed as:

$$b_{i,k} = \big(c^{h}_k\big)^{\top} S\, c^{l}_i, \qquad (7.31)$$

where $S$ is the bilinear mapping matrix. Then, the intermediate embedding for high-level capsule $k$ is computed as a weighted sum of the low-level capsules:

$$z^{h}_k = \sum_{i \in \mathcal{V}_u} w_{i,k}\, c^{l}_i, \quad \text{where } w_{i,k} = \frac{\exp(b_{i,k})}{\sum_{k'=1}^{K}\exp(b_{i,k'})}, \qquad (7.32)$$

so $w_{i,k}$ can be seen as the attention weight connecting the two capsules. Finally, a "squash" function is applied to obtain the embedding of the high-level capsule:

$$c^{h}_k = \mathrm{squash}(z^{h}_k) = \frac{\|z^{h}_k\|^{2}}{1 + \|z^{h}_k\|^{2}}\, \frac{z^{h}_k}{\|z^{h}_k\|}. \qquad (7.33)$$
The above steps constitute one iteration of dynamic routing. The routing process is
usually repeated for several iterations to converge. When the routing finishes, the
high-level capsules can be used to represent the user u with multiple interests, to be
fed into subsequent network modules for inference (Li et al, 2019b), as shown in
Fig. 7.7.
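A compact sketch of the routing procedure in Equations 7.31-7.33 is given below; the initialization of the high-level capsules, the number of routing iterations, and the softmax normalization over interests are assumptions in the spirit of (Sabour et al, 2017; Li et al, 2019b) rather than an exact reproduction.

```python
import numpy as np

def squash(z):
    """Eq. (7.33): shrink a capsule vector so its length lies in (0, 1)."""
    norm_sq = (z ** 2).sum()
    return (norm_sq / (1.0 + norm_sq)) * z / np.sqrt(norm_sq + 1e-9)

def dynamic_routing(item_embeddings, K, S, iterations=3, seed=0):
    """Route item (low-level) capsules to K interest (high-level) capsules.

    item_embeddings : (num_items, d) embeddings c^l_i of interacted items
    S               : (d, d) bilinear mapping matrix (Eq. 7.31)
    Returns the K high-level capsules representing the user's interests.
    """
    rng = np.random.default_rng(seed)
    high = rng.normal(scale=0.1, size=(K, item_embeddings.shape[1]))  # init c^h_k
    for _ in range(iterations):
        # Routing logits b_{i,k} = (c^h_k)^T S c^l_i  (Eq. 7.31)
        logits = item_embeddings @ S.T @ high.T            # (num_items, K)
        logits = logits - logits.max(axis=1, keepdims=True)
        w = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        # Intermediate capsule: weighted sum of low-level capsules (Eq. 7.32).
        z = w.T @ item_embeddings                          # (K, d)
        high = np.array([squash(z_k) for z_k in z])        # Eq. (7.33)
    return high
```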
In this section, we introduce the setting for evaluating GNN explanations. This in-
cludes the datasets that are commonly used for constructing and explaining GNNs,
as well as the metrics that evaluate different aspects of explanations.
As more approaches have been proposed for explaining GNNs, a variety of datasets have been used to assess their effectiveness. As this research direction is still in an early stage, the choices of datasets and evaluation metrics vary across different works. Several metrics that are commonly used to evaluate explanation quality are listed below.
• Fidelity (Pope et al, 2019) measures how much the prediction score for the originally predicted class drops when the identified explanation is removed from the input (a computational sketch is given after this list):

$$fidelity = \frac{1}{N}\sum_{i=1}^{N}\Big( f^{y_i}(\mathcal{G}_i) - f^{y_i}\big(\mathcal{G}_i \setminus \mathcal{G}'_i\big)\Big), \qquad (7.34)$$

where $f$ is the output function of the target model, $\mathcal{G}_i$ is the $i$-th graph, $\mathcal{G}'_i$ is the explanation for it, and $\mathcal{G}_i \setminus \mathcal{G}'_i$ represents the perturbed $i$-th graph in which the identified explanation is removed.
• Contrastivity (Pope et al, 2019) uses Hamming distance to measure the dif-
ferences between two explanations. These two explanations correspond to the
model’s prediction of one instance for different classes. It is assumed that mod-
els would highlight different features when making predictions for different
classes. The higher the contrastivity, the better the performance of the inter-
preter.
• Sparsity (Pope et al, 2019) is calculated as the ratio of explanation graph size
to input graph size. In some cases, explanations are encouraged to be sparse,
because a good explanation should include only the essential features as far as
possible and discard the irrelevant ones.
• Stability (Sanchez-Lengeling et al, 2020) measures the performance gap of the interpreter before and after adding noise to the input. It suggests that a good explanation should be robust to slight changes in the input that do not affect the model's prediction.
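The following sketch illustrates how fidelity (Equation 7.34) and sparsity could be computed; the model, the remove_fn helper, and the graph representations are user-supplied and purely illustrative.

```python
import numpy as np

def fidelity(model, graphs, explanations, labels, remove_fn):
    """Eq. (7.34): average drop in the predicted-class score after removing the
    identified explanation subgraph from each input graph.

    model     : callable, model(G) -> vector of class scores (assumed API)
    remove_fn : callable, remove_fn(G, G_exp) -> G with the explanation removed
    graphs, explanations, labels : parallel lists of inputs, explanations, classes
    """
    drops = [model(G)[y] - model(remove_fn(G, G_exp))[y]
             for G, G_exp, y in zip(graphs, explanations, labels)]
    return float(np.mean(drops))

def sparsity(explanation_size, graph_size):
    """Ratio of explanation size to input graph size (e.g., number of edges)."""
    return explanation_size / graph_size
```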
Interpretation of graph neural networks is an emerging domain, and there are still many challenges to be tackled. In this section, we list several future directions for improving the interpretability of graph neural networks.
First, some online applications require real-time responses from models and algorithms, which places high requirements on the efficiency of explanation methods. However, many GNN explanation methods rely on sampling or highly iterative algorithms to obtain their results, which is time-consuming. Therefore, one future research direction is how to develop more efficient explanation algorithms without significantly sacrificing explanation precision.
Second, although more and more methods have been developed for interpreting
GNN models, how to utilize interpretation towards identifying GNN model defects
and improving model properties is still rarely discussed in existing work. Will GNN
models be largely affected by adversarial attacks or backdoor attacks? Can interpre-
tation help us to tackle these issues? How to improve GNN models if they have been
found to be biased or untrustworthy?
Third, besides attention methods and disentangled representation learning, are
there other modeling or training paradigms that could also improve GNN inter-
pretability? In the interpretable machine learning domain, some researchers are in-
terested in providing causal relations between variables, while some others prefer
using logic rules for reasoning. Therefore, how to bring causality into GNN learning, or how to incorporate logic reasoning into GNN inference, may be an interesting direction to explore.
Fourth, most existing efforts on interpretable machine learning have been devoted to obtaining more accurate interpretations, while the human experience aspect is usually overlooked. For end-users, friendly interpretation can improve the user experience and gain their trust in the system. For domain experts without a machine learning
background, an intuitive interface helps integrate them into the system improvement
loop. Therefore, another possible direction is how to leverage human-computer interaction (HCI) techniques to present explanations in a more user-friendly format, or how to design better human-computer interfaces to facilitate user interaction with the model.
Acknowledgements The work is, in part, supported by NSF (#IIS-1900990, #IIS-1718840, #IIS-
1750074). The views and conclusions contained in this paper are those of the authors and should
not be interpreted as representing any funding agencies.
Editor’s Notes: Similar to the general trend in the machine learning do-
main, explainability has been ever more widely recognized as an important
metric for graph neural networks in addition to those well recognized be-
fore such as effectiveness (Chapter 4), complexity (Chapter 5), efficiency
(Chapter 6), and robustness (Chapter 8). Explainability can not only broadly
influence technique development (e.g., Chapters 9-18) by informing model
developers of useful model details, but also could benefit domain experts in
various application domains (e.g., Chapters 19-27) by providing them with
explanations of predictions.