CSEE JOURNAL OF POWER AND ENERGY SYSTEMS, VOL. 7, NO. 3, MAY 2021

Detection and Classification of Transmission Line Transient Faults Based on Graph Convolutional Neural Network
Houjie Tong, Robert C. Qiu, Fellow, IEEE, Dongxia Zhang, Haosen Yang, Qi Ding, and Xin Shi

Abstract: We present a novel transient fault detection and classification approach for power transmission lines based on a graph convolutional neural network. Compared with existing techniques, the proposed approach considers explicit spatial information in sampling sequences as prior knowledge and has a stronger feature extraction ability. On this basis, a framework for transient fault detection and classification is created. A graph structure is generated to provide topology information to the task. Our approach takes the adjacency matrix of the topology graph and the bus voltage signals during a sampling period after transient faults as inputs, and rapidly outputs the predicted classification results. Furthermore, the proposed approach is tested in various situations and its generalization ability is verified by experimental results. The results show that the proposed approach can detect and classify transient faults more effectively than existing techniques, and that it is practical for online transmission line protection owing to its rapidness, high robustness and generalization ability.

Index Terms: Graph convolutional network (GCN), power transmission line, fault detection and classification, spatio-temporal data, topology information.

Manuscript received September 15, 2020; revised December 30, 2020; accepted January 19, 2021. Date of online publication April 30, 2021; date of current version May 7, 2021. This work was supported by the National Key Research and Development Program of China under Grant 2018YFF0214704.
H. J. Tong (corresponding author, email: thj 926@sjtu.edu.cn; ORCID: https://orcid.org/0000-0002-9683-8208), R. C. Qiu, H. S. Yang, and Q. Ding are with the Department of Electrical Engineering, Center for Big Data and Artificial Intelligence, Shanghai Jiao Tong University, Shanghai 200240, China. R. C. Qiu is also with the School of Electronic Information and Communication, Huazhong University of Science and Technology, Wuhan 430000, China.
D. X. Zhang is with China Electric Power Research Institute, Haidian District, Beijing 100192, China.
X. Shi is with the School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China.
DOI: 10.17775/CSEEJPES.2020.04970
2096-0042 © 2020 CSEE

I. INTRODUCTION

Transient fault detection and classification in power transmission are the basis for analyzing and handling power accidents, and are of great significance for improving the stability of the power grid. With the growing scale of interconnection and the development of operation under stressed conditions in modern power systems [1], [2], the features of transient faults become more complex, which makes fault detection and classification more urgent. Only by promptly and accurately determining the type of faults that occur in the transmission system can the operator take effective emergency control actions according to the classification results, which facilitates locating the faults and reduces the time needed to eliminate them.

For transient fault detection and classification, the extraction of fault features is a key task. Unlike fault identification based on image data [3], the feature extraction of voltage and current data involved in this paper is more abstract. Early research is based on transmission line fault mechanism models. In reference [4], the mechanism of fault current generation and the fault features are analyzed by establishing the expression of the fault current. Reference [5] uses the fault equivalent circuit to determine the fault current and a threshold to classify the faults. These studies derive the expression of the fault current or voltage through fault mechanism analysis and then make a fault diagnosis. Such model-driven techniques may achieve good results under specific scenarios, but poor generalizability is their drawback. The key reason is that a single model cannot fully depict the various mechanisms involved in electrical events, and it becomes invalid in variable environments [6]. Moreover, these methods usually require many assumptions, and the modeling process involves a lot of manual calculation, which is time-consuming and labor-intensive [7].

With the rise of artificial intelligence technology in the era of industrial big data, data-driven fault detection and classification methods have begun to show more remarkable performance. An early work uses a support vector machine (SVM) to identify transmission line fault types under different fault working conditions, taking wavelet singular information as characteristic parameters [8]. Reference [9] defines four multi-wavelet packet entropies to extract transmission line fault signals and uses a radial basis function neural network to obtain classification results. Other scholars adopt the decision tree algorithm and the K-nearest neighbor (KNN) algorithm, respectively, to identify transmission line fault types [10], [11]. In recent years, end-to-end neural networks (NNs) featuring self-learning ability have been introduced. In reference [12], a sparse autoencoder (SAE) is proposed to process voltage and current signals of transmission lines for fault classification. Reference [13] regards the voltage signal matrix as a grayscale image for input, thereby using a convolutional neural network (CNN) to realize fault classification. The concept of a spatio-temporal matrix is mentioned in some literature, such as references [14], [15]. The authors utilize the random matrix theory (RMT) to
explore the spatio-temporal correlation of abnormal data in reference [14], but the spatio-temporal matrix there does not introduce the connection relationship between nodes. In other words, the spatial relationship of the data is not provided as prior knowledge.

As pointed out above, fault detection and classification methods have evolved from model-driven to data-driven. With the power industry gradually becoming intelligent, the more complex characteristics of transient faults and the multi-variation of transmission system operation keep data-driven fault detection and classification techniques the leading approaches at present [1], [16]. Our review finds that even though many studies have investigated new approaches to improve fault detection and classification, few researchers have considered the explicit spatial relations among the fault data of transmission systems. However, power system data, as typical "industrial big data", is indeed a kind of spatio-temporal data [17], [18]. Although the spatio-temporal correlation of fault data is vital, it is difficult to introduce it explicitly into the detection and classification task through existing techniques. Our work aims to fill this gap. Following the well-established research line of GCN, we come up with new ideas for dealing with graph-structured data.

The graph NN was first introduced by Bruna et al. [19]. It applies convolutional layers to graph-structured data rather than only to regular data such as images. Whereas a CNN cannot effectively process irregular data, a GCN allows researchers to make effective use of explicit spatial information. Owing to the universality and diversity of graph-structured data, the development and application of GCN are rising rapidly. It has been successfully applied in recommendation systems, social networks, life science and other fields [20]. In our view, the power system topology is naturally a graph, and the edge information in the graph can also be extended to "electrical distance". The topological structure of the power transmission system and the large amounts of measurement data provide a new opportunity for proper applications of GCN in power systems. In our work, a GCN based on power topology is used to detect and classify transient faults.

Briefly, this paper has the following contributions:
1) To the best of our knowledge, this is the first work to leverage GCN for the transient fault detection and classification task.
2) A drawback of existing techniques is pointed out: the effect of explicit spatial information has not been taken into account. Therefore, we provide a novel idea of embedding the spatio-temporal relations between data into detection and classification models. In brief, we propose to regard the transmission line topology as a graph and utilize topology parameters to construct graph elements.
3) In addition, we introduce the receiver operating characteristic (ROC) curve instead of only using accuracy to characterize the overall performance of the classifier [21]. Further, comparison studies are implemented from three aspects to analyze the generalization ability of our proposed method and of existing machine learning techniques.

The rest of our paper is organized as follows. Section II introduces the graph convolutional network. Section III discusses the problem statement and the proposed method framework. Section IV is the case study. Section V summarizes our work.

II. GRAPH CONVOLUTIONAL NEURAL NETWORK

In this section, we first introduce the graph structure and provide an example to illustrate the workflow of graph convolution.

A. Graph Structure

In our work, we treat a transmission system as an undirected graph $G = (V, E, A)$ with $N$ nodes $v_i \in V$, edges $(v_i, v_j) \in E$, a (weighted) adjacency matrix $A \in \mathbb{R}^{N \times N}$ and a degree matrix $D_{ii} = \sum_j A_{ij}$. The structure of the undirected graph is depicted in Fig. 1.

Fig. 1. Undirected graph structure with nodes $v_i$ and edge weights $e_{ij}$ ($i, j = 1, 2, 3, 4, 5, 6$).

The adjacency matrix $A$ represents the connection relationships of all nodes in a graph, as follows:

$$A = \begin{bmatrix} 0 & e_{12} & 0 & 0 & e_{15} & e_{16} \\ e_{21} & 0 & e_{23} & e_{24} & 0 & 0 \\ 0 & e_{32} & 0 & 0 & 0 & 0 \\ 0 & e_{42} & 0 & 0 & 0 & 0 \\ e_{51} & 0 & 0 & 0 & 0 & 0 \\ e_{61} & 0 & 0 & 0 & 0 & 0 \end{bmatrix}_{6 \times 6} \tag{1}$$

where $e_{ij}$ represents the correlation between the $i$th node and the $j$th node. If two nodes are connected by an edge, $e_{ij}$ is equal to the weight coefficient of this edge; if not, $e_{ij} = 0$. It is worth noting that $e_{ij} = e_{ji}$ in the undirected graph. Besides, the degree matrix $D$ is diagonal, and each diagonal element equals the sum of the weights of the edges incident to the corresponding node (that is, the number of adjacent nodes when all weights are 1).

B. Workflow of Graph Convolution

Graph convolution was originally derived from graph theory and the convolution theorem with the purpose of applying it to graph data processing [22]. Through constant refinement and optimization of the model, the expression of GCN has become more understandable.

In practical applications, we utilize the commonly used GCN for the graph convolution operation, in the form of feature transfer and aggregation through the self-normalized adjacency matrix. This GCN was proposed by Kipf et al. [23], and its one-layer operation is as follows:

$$Z = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X W\right) = \sigma\left(\hat{A} X W\right) \tag{2}$$
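As an illustration of (2), the following minimal sketch applies one GCN layer to the six-node graph of Fig. 1 using only the Python standard library. The unit edge weights, the feature matrix `X`, the weight matrix `W`, and the choice of ReLU as the activation are assumptions made for the example; they are not values from the paper.

```python
import math

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a square weighted adjacency matrix A."""
    n = len(A)
    # Add self-loops: A~ = A + I
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degree of A~ (row sums, self-loop included)
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(P, Q):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def relu(M):
    """Element-wise ReLU (assumed activation for this sketch)."""
    return [[max(0.0, v) for v in row] for row in M]

def gcn_layer(A, X, W):
    """One layer Z = relu(A_hat X W), following the form of (2)."""
    return relu(matmul(matmul(normalized_adjacency(A), X), W))

# Graph of Fig. 1: edges 1-2, 1-5, 1-6, 2-3, 2-4, with unit weights (assumption).
edges = [(1, 2), (1, 5), (1, 6), (2, 3), (2, 4)]
A = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    A[i - 1][j - 1] = A[j - 1][i - 1] = 1.0

X = [[float(i + 1), 1.0] for i in range(6)]  # 6 nodes, n = 2 made-up features each
W = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.5]]      # maps n = 2 features to m = 3 (arbitrary)

A_hat = normalized_adjacency(A)
Z = gcn_layer(A, X, W)                       # 6 x 3 output, one row per node
```

Each row of `Z` mixes the features of a node with those of its direct neighbors only, because `A_hat` is zero wherever two distinct nodes share no edge.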
In (2), $\tilde{A} = A + I$ is the adjacency matrix with self-loops. A self-loop maintains the information of the target node itself in the convolution, which is a required design strategy in GCN. $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$ is the degree matrix of $\tilde{A}$, so $\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ represents the self-normalized adjacency matrix. $W$ is the trainable weight matrix and $\sigma(\cdot)$ is the activation function.

The workflow of graph convolution is depicted in Fig. 2. Firstly, taking the graph in Fig. 1 as an example, we assume that the feature of node $v_i$ is $X_i = [x_{i1} \ldots x_{in}]^T$, so $X \in \mathbb{R}^{6 \times n}$. Secondly, multiplying $\hat{A}$ by $X$ transfers and aggregates the features of the adjacent nodes, as shown in the middle part of Fig. 2. Finally, $Z$ is the output of this GCN layer, in which all nodes contain first-order neighborhood information. It is easy to deduce that the output neurons obtained through $k$ GCN layers can express $k$-order neighborhood information (spatial information). Therefore, the hidden layer data of GCN can provide more prior information for model training, so that the trained hidden layer neurons have a deeper feature expression ability.

Fig. 2. Workflow of graph convolution (left part: input nodal features and edge weights; middle part: transfer and aggregate nodal features; right part: output new representations of nodal features).

III. DETECTION AND CLASSIFICATION OF TRANSIENT FAULTS BASED ON GRAPH CONVOLUTIONAL NEURAL NETWORK

A. Problem Statement

Transmission line transient faults can be divided into single-phase ground faults, two-phase short circuit faults, two-phase ground faults and three-phase short circuit faults. Common causes of these faults are lightning strike, wind deviation, pollution flashover, icing, external force, bird damage and some internal faults of the system. The severity of the four main types of faults is obviously different. The three-phase short circuit fault is the most harmful fault in the power transmission system and requires the shortest clearing time. The single-phase ground fault is not as harmful as other kinds of faults, but should not be neglected due to its high occurrence frequency, accounting for more than 90% of the total faults. When a transient fault occurs in a transmission line, the nodal voltage drops to different degrees. To show the characteristics of different faults, some tests are implemented on a small transmission system with few nodes. Fig. 3 shows the changes of the nodal voltage waveform before and after the occurrence of various faults.

1) Normal Condition
In the normal condition of a transmission system, the nodal voltage usually stays around 1.0 p.u. Unlike the steady-state simulation data mentioned in other papers, slight voltage fluctuations may occur in the transient data under normal operating conditions.

2) Fault Condition
As shown in Fig. 3, the amplitude of the voltage reduction differs across nodes and across fault conditions. Furthermore, the same fault influences different nodes differently because of their different distances from the fault location.

The final purpose of our work is to detect the occurrence of faults and to determine which kind of fault occurs by learning deep representations of system nodal voltages.

B. Construction of Transmission System Graph Structure

In this section we elaborate on how to construct the graph-structured data of a transmission system. The proposed way of constructing the graph can be generalized to other engineering tasks involving topology structure in power systems. Here we take the IEEE 9-bus system as an example to illustrate the construction process of the graph, which is shown in Fig. 4.

First of all, we define the nodes and their features. The nodes here are the bus nodes in the transmission system, whose features are bus voltages or other electrical data.

Secondly, edges are defined. The lines in the topology are the edges of the graph, which means that if there is a line between two nodes, an edge can be assumed to exist between
these two nodes in the graph.

Fig. 3. Data of different fault types: three-phase node voltage waveforms (voltage in p.u. versus sampling number) for (a) single-phase-ground, (b) two-phase, (c) two-phase-ground and (d) three-phase faults.

Fig. 4. Graph construction (bus → node; line → edge; line impedance → edge weight). The IEEE 9-bus single-line diagram is mapped to a graph with nine nodes and the edges e14, e27, e39, e45, e46, e57, e69, e78 and e89.

In addition, we reckon that edges should be informative. The consideration of nodal features and the existence of edges only covers the topological connections in spatial information, but does not quantify the correlation between nodes. In reality, if the bus nodes are far apart from each other, the similarity between them may not be high even if they are connected. If the edge weights are not considered according to the actual
scene, the aggregation of neighborhood information will be less accurate in the process of graph convolution. Therefore, we create a calculation criterion to get the weight of each edge using the line parameters. The calculation formula is as follows:

$$e_{ij} = \begin{cases} \dfrac{1}{\sqrt{R_{ij}^2 + X_{ij}^2}} & \text{if nodes } i, j \text{ are connected} \\ 0 & \text{else} \end{cases} \tag{3}$$

where $e_{ij}$ represents the edge weight coefficient between node $i$ and node $j$, and $R_{ij}$ and $X_{ij}$ represent the line resistance and reactance parameters, respectively. The significance of this equation is that a longer distance or a larger impedance indicates a smaller correlation between nodes. The idea of introducing this criterion comes from the definition of edges in relevant applications of social recommendation [24]. In the interpersonal graph, edge information is used to represent the "user-user" relationship, which is usually called "social distance". Therefore, we come up with the idea of extending the "social distance" to the "electrical distance" in the power system. Specifically, "social distance" is the quantification of the relationship between people, while "electrical distance" is the quantification of the spatio-temporal correlation of power data. We propose (3) as one reasonable way of constructing edge weights. In reality, there can be various ways of defining edge information according to different requirements.

Finally, we construct the graph structure based on the transmission system, as shown in Fig. 4.

C. Workflow of GCN Based on Fault Detection and Classification

After building the graph structure, the next step is to build a GCN model for fault detection and classification. Concretely, this network is expected to output a non-faulty result when no fault occurs, and to detect the corresponding fault when a specific fault type occurs.

To illustrate the whole workflow concisely, voltages are selected as the nodal features. Then the input matrix $X$ of the GCN can be written in the following form (IEEE 9-bus system as an example):

$$X = \begin{bmatrix} u_{1,t_1} & u_{1,t_2} & \cdots & u_{1,t_s} \\ u_{2,t_1} & u_{2,t_2} & \cdots & u_{2,t_s} \\ \vdots & \vdots & \ddots & \vdots \\ u_{9,t_1} & u_{9,t_2} & \cdots & u_{9,t_s} \end{bmatrix}_{9 \times s} \tag{4}$$

where each row of $X$ represents the series of voltage magnitudes of a particular node in the IEEE 9-bus system and $s$ is the number of voltage samples in a sampling period. We consider using a multi-layer GCN for supervised fault classification. We first construct the self-normalized adjacency matrix $\hat{A}$ according to the edge weights calculated by (3), and then our feedforward network takes the simple form:

$$H^{(l+1)} = \sigma\left(\hat{A} H^{(l)} W^{(l)}\right), \quad l = 0, 1, 2, \ldots \tag{5}$$

Here, $\sigma$ is an activation function such as ReLU or Sigmoid, and $H^{(0)} \in \mathbb{R}^{n \times s}$ is equal to $X$; if we assume that the first hidden layer has $h$ feature maps, then $W^{(0)} \in \mathbb{R}^{s \times h}$ is an input-to-hidden weight matrix. The $l$th hidden layer likewise determines its vector dimension through the designed number of feature maps. The convolution operation is described in Section II-B and is not repeated here. We assume that the final GCN layer output is $H \in \mathbb{R}^{n \times m}$.

Since this is a classification task, the last layer of our model needs to be a fully connected layer, as shown in Fig. 5. The output features of the last hidden layer are stacked into a long feature vector $S^{(i)}$, which is used as the input vector of a softmax classifier. The length of $S^{(i)}$, $n_s$, is calculated as

$$n_s = n \times m \tag{6}$$
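The steps in (3), (5) and (6) can be sketched together for the IEEE 9-bus example as follows, using only the Python standard library. The branch impedances follow commonly published 9-bus data and their assignment to branches is an assumption here, since the text does not list them; the layer sizes `h` and `m`, the random weights and the synthetic voltages are hypothetical and serve only to make the shapes concrete.

```python
import math
import random

# Series impedance (R, X) in p.u. for each branch (i, j).
# Assumption: branch-to-value mapping taken from commonly published 9-bus data.
BRANCHES = {
    (1, 4): (0.0, 0.0576), (2, 7): (0.0, 0.0625), (3, 9): (0.0, 0.0586),
    (4, 5): (0.010, 0.085), (4, 6): (0.017, 0.092), (5, 7): (0.032, 0.161),
    (6, 9): (0.039, 0.170), (7, 8): (0.0085, 0.072), (8, 9): (0.0119, 0.1008),
}

def edge_weight(r, x):
    """Edge weight of (3): reciprocal of the branch impedance magnitude."""
    return 1.0 / math.sqrt(r * r + x * x)

def build_adjacency(branches, n):
    """Symmetric weighted adjacency matrix; unconnected pairs stay 0."""
    A = [[0.0] * n for _ in range(n)]
    for (i, j), (r, x) in branches.items():
        A[i - 1][j - 1] = A[j - 1][i - 1] = edge_weight(r, x)
    return A

def normalized_adjacency(A):
    """Self-normalized adjacency with self-loops, as used in (2) and (5)."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def relu(M):
    return [[max(0.0, v) for v in row] for row in M]

def forward(A_hat, X, weights):
    """Apply H <- relu(A_hat H W) once per weight matrix, following (5)."""
    H = X
    for W in weights:
        H = relu(matmul(matmul(A_hat, H), W))
    return H

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n, s, h, m = 9, 8, 16, 4                      # s, h, m are hypothetical sizes
A = build_adjacency(BRANCHES, n)
A_hat = normalized_adjacency(A)
# Synthetic voltage magnitudes near 1.0 p.u., one row per bus as in (4).
X = [[1.0 + rng.uniform(-0.02, 0.02) for _ in range(s)] for _ in range(n)]
H = forward(A_hat, X, [random_matrix(s, h, rng), random_matrix(h, m, rng)])
S = [v for row in H for v in row]             # stacked vector of length n * m, as in (6)
```

With these sizes the stacked vector has 9 × 4 = 36 entries. A low-impedance branch such as (1, 4) receives a large weight, so its two end buses exchange more information during aggregation than buses joined by a high-impedance line.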

Fig. 5. Schematic diagram of softmax classifier.
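The classifier of Fig. 5, formalized in (7) and (8), can be sketched as below with the Python standard library, together with a cross-entropy loss of the kind used for training. The toy stacked vector `S`, the parameter vectors in `theta` and the class ordering in `CLASSES` are hypothetical values chosen for illustration.

```python
import math

# Five output classes: four fault types plus the non-faulty case.
CLASSES = ["single-phase-G", "two-phase", "two-phase-G", "three-phase", "non-faulty"]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(v - peak) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(S, theta):
    """Return (class probabilities, index of the most probable class).

    theta holds one parameter vector per class, each as long as S.
    """
    scores = [sum(t * s for t, s in zip(theta_j, S)) for theta_j in theta]
    probs = softmax(scores)
    return probs, max(range(len(probs)), key=probs.__getitem__)

def cross_entropy(p_true, q_pred, eps=1e-12):
    """Cross-entropy between a one-hot label p_true and predictions q_pred."""
    return -sum(p * math.log(q + eps) for p, q in zip(p_true, q_pred))

S = [0.2, -0.1, 0.4, 0.0]                     # toy stacked feature vector, n_s = 4
theta = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
probs, t = classify(S, theta)                 # t is the arg max over classes
label = [1.0 if k == t else 0.0 for k in range(len(CLASSES))]
loss = cross_entropy(label, probs)            # loss if the prediction were the true label
```

The subtraction of `peak` inside `softmax` does not change the probabilities; it only avoids overflow when scores are large.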


In (6), $n$ is the number of nodes ($n = 9$ for the IEEE 9-bus system), and $m$ is the number of nodal features at the GCN output layer.

Concretely, softmax classifiers are based on the softmax regression model, which is an extension of the logistic regression model and is able to solve multi-class problems [25]. For the softmax classifier, the probability of the $i$th stacked input vector $S^{(i)}$ belonging to class $j$, i.e. $P(Y = j \mid S^{(i)})$, is calculated as

$$P\left(Y = j \mid S^{(i)}\right) = \frac{e^{\theta_j^T S^{(i)}}}{\sum_{l=1}^{K} e^{\theta_l^T S^{(i)}}} \tag{7}$$

where $Y$ is the stochastic variable of the output class corresponding to $S^{(i)}$ and $\theta_j \in \mathbb{R}^{n_s}$ is the parameter vector for class $j$, with $j = 1, 2, \ldots, K$. Consequently, for the fault detection and classification problem with 5 types (4 fault types and non-faulty), a 5-dimensional vector containing all 5 probabilities is given as the output of the softmax classifier. We then let $t^{(i)}$ denote the fault type with the highest probability:

$$t^{(i)} = \arg\max_j P\left(Y = j \mid S^{(i)}\right) \tag{8}$$

where $t^{(i)}$ represents the predicted class of the $i$th input at the output layer. Finally, the network is trained by backpropagation according to the error between the output category and the real fault label.

D. Framework for Fault Detection and Classification

The overall framework of the GCN-based fault detection and classification approach proposed in this paper is shown in Fig. 7, which includes: (i) Construct the transmission line model in the PSCAD/EMTDC simulation software in order to generate massive fault data samples through Python scripts. (ii) Build a graph classification model, including graph structure construction and parameter settings of the neural network. (iii) Visualize the results. The three steps communicate and integrate through the Python API, and finally achieve the integration of simulation, classification and analysis.

IV. CASE STUDY

A. Transmission System Studied and Data Acquisition

To obtain massive labeled transient fault samples, we build a simulated power transmission system with reference to the IEEE 39-bus standard test system. Considering that the generation of fault samples needs an excellent transient simulation environment, PSCAD/EMTDC is chosen as the simulation software. Fig. 6 shows the electrical single-line diagram of the 39-bus system. The IEEE 39-bus standard test system consists of 10 generators, 12 three-phase transformers, 34 transmission lines and some loads.

Fig. 6. Topology of IEEE39 transmission system.

This task requires sufficient training data, so we generate a series of labeled, independent and identically distributed samples by setting up different fault locations, fault types and fault impedances. We leverage the "mrhc-automation" library (PSCAD-Python interface library) to realize batch simulations, thereby avoiding repetitive manual operations.

B. Selection of Network Parameters

1) Simulation Parameters
As mentioned above, we build an IEEE 39-bus standard system simulation model in PSCAD/EMTDC. To obtain independent and identically distributed samples, we configure three types of fault parameters, i.e. different fault types, fault positions, and fault resistances, as shown in Table I. The fault inception time is 1.0 s and the fault duration is 0.1 s. The sampling frequency in PSCAD is set to 4 kHz. That is to say, 400 fault sampling values are generated in the fault period of 0.1 s. We take 80 out of the 400 sampling values at equal intervals as nodal features. The transmission system used in this paper has 34 transmission lines, and each line has 10 fault points. Therefore, we get 5 (fault types) × 34 (lines) × 10 (positions) × 7 (resistances) = 11900 samples, which are divided into a training set and a testing set at a ratio of 7:3. We consider these 11900 samples as standard fault data.

TABLE I
FAULT PARAMETERS USED FOR SIMULATION
Fault parameter         Values or types
Fault type              Single-phase-G, two-phase, two-phase-G, three-phase, non-faulty
Fault position          1/10, 2/10, ..., 10/10 of length of lines
Fault resistance (Ω)    0.01, 1, 10, 20, 30, 40, 50

2) GCN Parameters
The hyperparameter selection of the GCN is depicted in Table II. As reference [26] shows, the number of hidden layers in a graph convolutional network is usually set to 2 or 3. Deep graph convolutional networks suffer from excessive smoothing [23], which can be simply explained as the features of all nodes tending to become homogeneous. Therefore, we finally
choose a model with 3 hidden layers by testing and comparing the effects of different numbers of layers. The numbers of hidden neurons are determined to be 150, 300 and 150, respectively, through constant tuning and optimization [27], and ReLU is selected as the activation function of each layer. When propagating to the last hidden layer, the dimension of the nodal features becomes 150, while the number of nodes remains 39. Thus the input size of the softmax classifier is 39 × 150 = 5850. For supervised multi-classification problems, we usually choose the cross-entropy error as the cost function because its loss can be calculated through a simple derivative and it has a fast rate of convergence [28]. The calculation formula is as follows:

$$CE(p, q) = -\sum_{i=1}^{C} p_i \log(q_i) \tag{9}$$

where $C$ represents the number of categories, $p_i$ is the true value and $q_i$ is the predicted value.

TABLE II
HYPERPARAMETERS OF GCN
Hyperparameter    Values or types
GCN layer         Three layers (150, 300, 150 neurons)
Loss function     Cross-entropy loss
Optimizer         Adam

The Adam algorithm is chosen as the optimizer owing to its fast convergence speed, high learning efficiency and small memory requirement. It is well suited to processing large data sets and handles sparse data and data with noisy samples well [29]. In our test, Adam performs better than other optimizers such as Stochastic Gradient Descent (SGD) and Batch Gradient Descent (BGD). Therefore, Adam is finally selected to realize the automatic adjustment of the learning rate.

Fig. 7. Flow chart. (Fault simulation: generator, line, transformer and load parameters are input to the transmission system model, and faults are simulated under different working conditions to produce single-phase-G, two-phase, two-phase-G, three-phase-G and non-faulty data, which are normalized and split into training and testing sets together with the topology information. Data analysis and visualization: the graph convolutional network and other classification methods are compared in terms of classification performance, promptness and robustness.)

C. Performance of the Proposed Method with Standard Data

To validate the overall performance of our proposed method, we demonstrate its detection and classification effect from three aspects: classification performance in various situations, response speed, and robustness. Performance comparison with common machine learning methods is indispensable. In this section, we first test the classification performance of our method with data obtained in a standard system.

For clarity, the accuracies and recall rates of the proposed method for the classification of the five types are calculated and listed in Table III. The overall classification accuracy is 98.28%, and the classification accuracy for each type is higher than 97.4%. This result shows that our proposed method is capable of classifying faults with quite high accuracy.

TABLE III
CLASSIFICATION RESULTS OF THE PROPOSED METHOD FOR DIFFERENT FAULT TYPES
Fault type        Accuracy (%)    Recall (%)
Single-phase-G    98.18           98.18
Two-phase         98.04           98.04
Two-phase-G       97.76           97.76
Three-phase-G     97.48           97.48
Non-faulty        99.93           99.93
(Average)         98.28           98.28

Further, the classification performance of the proposed method is compared with that of common machine learning algorithms including support vector machine (SVM), decision tree (DT), K nearest neighbor algorithm (KNN), random forest (RF), linear regression (LR), naive Bayes algorithm (NB), fully connected network (FCN) and convolutional neural
network (CNN). The first six methods are traditional classification algorithms, while the latter two are end-to-end neural network classifiers. To compare the performance of the classifiers more comprehensively, we use the receiver operating characteristic (ROC) curve, a comprehensive index that reflects the overall performance of a classifier in classification problems [30]. The horizontal axis of the ROC curve represents the false positive rate (FPR), while the vertical axis represents the true positive rate (TPR). The two are defined as follows:

$$\mathrm{FPR} = \frac{\sum_{i=1}^{n} \mathrm{FP}_i}{\sum_{i=1}^{n} (\mathrm{FP}_i + \mathrm{TN}_i)} \tag{10}$$

$$\mathrm{TPR} = \frac{\sum_{i=1}^{n} \mathrm{TP}_i}{\sum_{i=1}^{n} (\mathrm{TP}_i + \mathrm{FN}_i)} \tag{11}$$

where n is the number of fault types, T/F means true or false, P/N means positive or negative, and TP_i/FP_i denotes the TP/FP of the i-th type. FPR is therefore the proportion of actual negative samples that are predicted as positive, and TPR is the proportion of actual positive samples that are predicted as positive. By setting different thresholds for the softmax output, we get different (FPR, TPR) points, which constitute the ROC curve. One of the great advantages of the ROC curve is that its shape remains basically unchanged when the distribution of positive and negative samples changes. Therefore, this evaluation index can not only reduce the interference brought by different testing sets, but also measure the performance of a model more objectively. Further, the area under the curve (AUC) is calculated in order to quantify classification performance. The results are shown in Fig. 8.

Fig. 8. Comparison of ROC curves (x-axis: false positive rate; y-axis: true positive rate). Average-ROC AUC per method: SVM 0.9484, RF 0.9657, KNN 0.9779, LR 0.9608, NB 0.9260, DT 0.8092, FCN 0.9916, CNN 0.9918, GCN 0.9994.

According to the definitions of TPR and FPR, the ideal goal is TPR = 1 and FPR = 0, and the AUC of the ideal goal is 1.0. In other words, the closer a ROC curve is to the point (0, 1), the better the classification performance. We can tell from Fig. 8 that the ROC curve of GCN is the closest to the ideal classification goal (AUC = 0.9994). This result shows that the proposed method not only has high classification accuracy and recall rate, but also has strong overall performance.

D. Performance of the Proposed Method With Renewable Energy Generation Integration

In order to simulate the operation of a real transmission grid, more environmental factors need to be considered. New energy power generation has increasingly become a hot spot in the industry. Since more and more renewable energy generation is connected to the power grid, we add a renewable energy module to the IEEE 39-bus system to simulate this situation.

We introduce a wind turbine into the IEEE 39-bus system to simulate the renewable energy generation integration, as shown in Fig. 9.

Fig. 9. Topology of the IEEE 39-bus transmission system with a wind turbine.

The selected wind turbine is a PSCAD-based calculation model [31], and we connect it to the No. 3 bus of the IEEE 39-bus system. We set the fault parameters as before, and get 3500 samples for testing the generalizability of the trained model. Fig. 10(a) and (b) show the voltage waveforms of some of the nodes under the two-phase short circuit fault. It can be seen that the characteristics of the fault data before and after the wind turbine integration are visibly different.

The classification accuracies of five types of fault data with the wind turbine are shown in Table IV below. We can see that when the characteristics of fault data become complicated due to the wind turbine integration, the well-trained model can still identify the fault with an averaged accuracy of 97.68%.

In addition, we depict the loss and accuracy curves of the training set and the testing set to verify that the model is less susceptible to over-fitting. According to the curves shown in Fig. 11(a), the loss of the testing set does not increase and

remains very low in the later stage of training convergence, indicating that the model is not subject to over-fitting [32].

TABLE IV
CLASSIFICATION ACCURACY (%) OF THE PROPOSED METHOD IN THE PRESENCE OF WIND TURBINE GENERATOR

Fault type        Standard system with wind turbine
Single-phase-G    97.48
Two-phase         97.20
Two-phase-G       97.20
Three-phase-G     96.64
Non-faulty        99.86
(Average)         97.68

Fig. 10. Fault waveforms before and after the wind turbine integration: (a) standard system; (b) system with the wind turbine (x-axis: feature number (time dimension); y-axis: voltage (p.u.)).

Fig. 11. Curves of training and testing loss and accuracy: (a) curves of loss; (b) curves of accuracy (x-axis: epoch; series: Standard_Train and Wind_Test).

E. Performance of the Proposed Method with Bad Data

Data measurement and acquisition usually bring a lot of bad data in a real power grid. Therefore, we add some bad data to the standard fault data to further test the performance of the model.

Three types of bad data are considered in our paper:
1) Inaccurate measuring is simulated by multiplying standard measurements by a random number ranging from 0.75 to 1.25, and is applied to 1% of the total sampling data.
2) Asynchronous sampling is simulated by selecting 5% of all PMUs and randomly moving the measurements forward or backward by n sampling values (n ∈ [1, 5], n ∈ Z).
3) Data loss is simulated by arbitrarily discarding sampling points, and is applied to 1% of the total sampling data.

We add the three types of bad data to the original sample set, and get 11900 new samples, which are still divided into a training set and a testing set at a ratio of 7:3.

Figures 12 and 13 show the voltage waveforms of 39 nodes before and after adding bad data under the single-phase ground fault. The waveform of the fault data is clearly more complicated after bad data are added. As can be seen from Table V, the averaged detection and classification accuracy of fault samples with bad data still reaches 96.71%. Results on the testing set indicate that the proposed approach tolerates bad data well.

TABLE V
CLASSIFICATION ACCURACY (%) OF THE PROPOSED METHOD IN THE PRESENCE OF BAD DATA

Fault type        Standard fault samples    Fault samples with bad data
Single-phase-G    98.18                     96.86
Two-phase         98.04                     96.78
Two-phase-G       97.76                     95.24
Three-phase-G     97.48                     96.64
Non-faulty        99.93                     98.04
(Average)         98.28                     96.71

Similarly, the loss and accuracy curves of training and testing are depicted in Fig. 14(a) and (b).
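The three bad-data types of this subsection can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the PMU-by-time list layout, the padding of a shifted row with its edge value, and the use of 0.0 to mark a discarded point are all assumptions made here; only the percentages and ranges come from the text.

```python
import random


def add_inaccurate(data, frac=0.01, lo=0.75, hi=1.25, rng=random):
    """Scale a random `frac` of all sampling points by a factor in [lo, hi]."""
    out = [row[:] for row in data]
    cells = [(i, j) for i, row in enumerate(out) for j in range(len(row))]
    for i, j in rng.sample(cells, max(1, round(frac * len(cells)))):
        out[i][j] *= rng.uniform(lo, hi)
    return out


def add_async(data, pmu_frac=0.05, max_shift=5, rng=random):
    """Shift `pmu_frac` of the PMU rows by n in [1, max_shift] samples."""
    out = [row[:] for row in data]
    count = max(1, round(pmu_frac * len(out)))
    for i in rng.sample(range(len(out)), count):
        n = rng.randint(1, max_shift)
        row = out[i]
        if rng.random() < 0.5:
            # Measurements moved forward: pad the tail with the last value (assumption).
            out[i] = row[n:] + [row[-1]] * n
        else:
            # Measurements moved backward: pad the head with the first value (assumption).
            out[i] = [row[0]] * n + row[:-n]
    return out


def add_loss(data, frac=0.01, fill=0.0, rng=random):
    """Discard a random `frac` of all sampling points, marking them with `fill`."""
    out = [row[:] for row in data]
    cells = [(i, j) for i, row in enumerate(out) for j in range(len(row))]
    for i, j in rng.sample(cells, max(1, round(frac * len(cells)))):
        out[i][j] = fill  # placeholder for a missing value (assumption)
    return out
```

Each function returns a new copy, so the three corruptions can be chained on one sample (for example 39 PMU rows of 80 voltage values) without modifying the standard data.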



Bad data do not cause the proposed method to overfit.

Fig. 12. Voltage waveforms of 39 nodes (standard data). The range 0 to 3120 on the x-axis corresponds to 39 (nodes) × 80 (voltage samples); the y-axis is voltage (p.u.).

Fig. 13. Voltage waveforms of 39 nodes with bad data (bad data overlaid on standard data; same axes as Fig. 12).

Fig. 14. Curves of training and testing loss and accuracy: (a) curves of loss; (b) curves of accuracy (x-axis: epoch; series: Standard_Train and Bad_Test).

F. Performance of the Proposed Method with not All Buses Measured

In reality, it is not practical to measure all the buses. Therefore, we must verify the effectiveness of the proposed method when not all the buses are measured.

Considering real situations, we divide the cases where not all the buses are measured into three categories:
1) The proportion of measurable nodes is high.
2) The proportion of measurable nodes is low, and the measurable nodes are clustered.
3) The proportion of measurable nodes is low, and the measurable nodes are discretely distributed.

Three new datasets are generated based on the original 11900 samples, and experiments are carried out on each of them. All the results are shown in Table VI.

TABLE VI
AVERAGED CLASSIFICATION ACCURACIES (%) OF THE METHODS WHEN NOT ALL THE BUSES ARE MEASURED

Name of methods    13 buses, discrete    13 buses, clustered    26 buses    39 buses
SVM                87.05                 88.11                  90.47       91.76
RF                 93.78                 94.96                  95.58       96.35
KNN                86.22                 85.99                  92.16       91.94
FCN                86.89                 85.43                  93.28       95.97
CNN                95.27                 95.23                  96.73       97.43
GCN                95.18                 96.36                  97.48       98.28

Firstly, it can be seen that when the measurable nodes account for the majority, GCN still achieves better performance than the existing machine learning techniques. In this case, the adjacency matrix simply no longer carries the neighborhood information of the few unmeasured nodes.

In addition, the 1st to 9th, 25th, 30th, 37th and 39th nodes are selected as clustered measurable nodes. The 13 measurable nodes form a local graph, which is depicted in Fig. 15. As can be seen from Table VI, the performance of GCN is also better in this situation.

Finally, a small number of measurable nodes can also be discretely distributed in the topology, as shown in Fig. 16. This situation makes the adjacency matrix a diagonal matrix, which makes GCN unable to extract the spatial relations between the data. In this case, GCN is equivalent to an ordinary neural network, so it does not fail to detect and classify faults but cannot show its advantages. Table VI shows that the averaged accuracy of GCN is not the best in this case (95.18% against 95.27% for CNN), which confirms the above inference.

G. Response Speed of the Proposed Method

Transient faults in power systems often cause great damage in a short time, so the response speed of fault identification ought to be guaranteed. The previous results are obtained when we input 80 fault sample values (the fault duration

within 0.1 s). To verify the sensitivity of the proposed model, we try to identify the faults in a shorter time. Besides, under the restriction of the data acquisition equipment, the sampling frequency is not very high in practice. Thus, experiments on the sensitivity of the model to faults at different sampling frequencies are added. The results are depicted in Fig. 17.

Fig. 15. Topology of the IEEE 39-bus transmission system with 13 clustered measurable buses.

Fig. 16. Topology of the IEEE 39-bus transmission system with 13 discrete measurable buses.

Fig. 17. Response speed for fault identification at different sampling frequencies (0.5 kHz, 2 kHz, 4 kHz and 10 kHz, with a reference line at accuracy = 97%). The x-axis is the time after the fault (s), where the "0" point represents the moment of the fault occurrence; the y-axis is accuracy (%).

First of all, the classification accuracy increases as the sampling frequency increases. This is in line with our expectation, since more sampling values can be obtained with a higher sampling frequency over the same sampling interval. More sample values of a given fault type provide richer transient characteristics of the fault. In contrast, sampling at a relatively low frequency [33] leads to insufficient data integrity.

Moreover, when only the sampling values within 0.0375 s after the fault occurs are taken as the model input, our method still achieves a high accuracy (above 97%), as long as the sampling frequency is not too low [33]. Considering that the accuracies under the common sampling frequencies are all above 97%, such a response speed is quite satisfactory.

H. Robustness of the Proposed Method

It is essential to ensure that the method used to detect and classify faults can withstand noise. Noise in a power transmission system refers to data fluctuation caused by load fluctuations or other uncontrollable events. We further compare the performance of the proposed method with some existing methods in the presence of noise.

Gaussian white noise is added to the fault data to test the robustness of the proposed method. The signal-to-noise ratio (SNR) [34] of the data is 15 dB, 20 dB, 25 dB, 30 dB, 35 dB and 40 dB respectively. Other fault parameters remain the same as introduced in Table I. For comparison, we also design detection and classification networks based on SVM, FCN and CNN. The radial basis function (RBF) is selected as the kernel of SVM [35], and 5-fold cross-validation and grid search are used to determine appropriate values of the parameters γ and C. Finally, γ is set to 0.05 and C is set to 10. The FCN has three fully connected hidden layers, and ReLU is selected as the activation function. After repeated tuning and testing, the number of neurons in the hidden layers is set to 1600, 800 and 500 respectively. Besides, the CNN classifier we designed has six convolutional layers (kernel size = 3 × 3, stride = 2, padding = 1), three max-pooling layers and three linear layers. LeakyReLU [36] is chosen as the

activation function. Further, the dropout [37] mechanism is also used in the CNN classifier. The evaluation results are depicted in Fig. 18.

Fig. 18. Robustness for fault identification at different SNRs (x-axis: signal-to-noise ratio from 15 dB to 40 dB, plus the noise-free case; y-axis: accuracy (%); series: GCN, CNN, FCN and SVM, with reference lines at accuracy = 96% and accuracy = 90%).

According to the results in the figure, when the SNR is above 20 dB, the classification accuracy of the GCN model still exceeds 96%, which is quite an encouraging result. To illustrate the robustness of our method more convincingly, data waveforms under various levels of noise are shown in Fig. 19. The waveform curves of five colors in each subfigure represent the data of the five categories. The subfigures show that the raw data become very chaotic when the SNR is 25 dB, not to mention 15 dB. Moreover, this is only the voltage waveform of a single node of the transmission system. If the voltages of all nodes in the whole system are considered, the task is much more difficult. Our model can still maintain a high classification accuracy in this situation. One explanation for this anti-noise performance is the "aggregation" effect of graph convolution. As mentioned above, one of the core functions of a graph convolution network is the aggregation of nodal features, which contains spatial information. The process of aggregating features offsets some effects of noise. From the above discussion, the high robustness of the model is verified.

I. Additional Experiments

1) Impedance Boundary of High Impedance Problem

The research of high impedance faults is a big challenge for power systems. However, we do not discuss high impedance faults in depth, for the following reasons.

Fig. 19. Data of different SNRs: (a) SNR = 15 dB; (b) SNR = 25 dB; (c) SNR = 35 dB (x-axis: sampling number; y-axis: voltage (p.u.)).
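For the noise test of Section H, Gaussian white noise at a prescribed SNR can be generated as in the sketch below. It assumes the common definition SNR (dB) = 10 log10(P_signal / P_noise) with the signal power taken as the mean square of the clean samples; the paper refers to [34] for its SNR definition, which may differ in detail, and the function names are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng=random):
    """Return `signal` plus zero-mean Gaussian white noise at the target SNR (dB)."""
    p_signal = sum(x * x for x in signal) / len(signal)  # mean-square signal power
    p_noise = p_signal / (10 ** (snr_db / 10))           # noise power implied by the SNR
    sigma = math.sqrt(p_noise)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR (dB) of `noisy` relative to the known clean signal."""
    p_signal = sum(x * x for x in clean) / len(clean)
    p_noise = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_noise)
```

Applying `add_awgn` to each node's 80-sample voltage sequence with `snr_db` set to 15, 25 or 35 would give waveforms of the kind shown in Fig. 19; `measured_snr_db` is only a sanity check on the generated data.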



Firstly, high impedance faults mainly occur in distribution networks (15 kV–25 kV), and power transmission systems with higher voltage levels have a low probability of high impedance faults [38]. For instance, at high voltage levels, the grounding medium may be broken down when a high impedance ground fault occurs, and the high impedance fault then becomes a low impedance fault.

Secondly, a high impedance fault cannot be identified only through the features of nodal voltage. As shown in Fig. 20(a) and (b), the nodal voltage waveform of a single-phase ground fault with a resistance of 1 ohm is very similar to that of a three-phase short-circuit fault with a resistance of 100 ohm. When the fault resistance is 300 ohm, the nodal voltage waveform of the three-phase short-circuit fault even approaches the normal operating condition, as depicted in Fig. 20(c). These conditions make it very difficult for purely data-driven methods to accurately identify the fault types. In general, high impedance problems require many effective features, such as the functional relationship between fault resistance and voltage variation, before we can use these features to realize big-data-level identification.

Fig. 20. Nodal voltage waveforms under different fault impedances: (a) single-phase fault, 1 ohm; (b) three-phase fault, 100 ohm; (c) three-phase fault, 300 ohm (x-axis: sampling number; y-axis: voltage (p.u.)).

Thirdly, the proposed approach starts from the perspective of mass data processing in the power grid. In fact, if we want to realize the detection and classification of various fault conditions (including high impedance fault conditions, etc.), we need to combine the proposed method with traditional protection theory to form a complete fault identification system [38]. The advantage of our method lies in the fact that it undertakes the role of data analysis when the amount of data in the power grid is huge, so that a concise and clear conclusion can be drawn from the overall analysis.

In order to determine the impedance boundary up to which our method can identify the five types of faults, we added extra experiments, the results of which are depicted in Fig. 21. A total of 3500 samples are simulated and tested (700 samples for each fault resistance). It can be seen from Fig. 21 that the highest fault impedance that our detection and classification model can classify with an accuracy of not less than 95% is 55 ohm. In addition, our model can identify fault samples with a fault impedance of up to 63 ohm with an accuracy of not less than 90%.

2) Validity of Adjacency Matrix

In this paper, we use the GCN-based method to detect and classify power system transient faults, and its main advantage is the explicit extraction of spatio-temporal relations between data. Further, we add comparative experiments to verify the effective role of spatial information in fault detection and classification tasks.
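The role that the adjacency matrix plays in a graph convolution can be illustrated with a small sketch. It assumes the renormalized propagation rule of Kipf and Welling [23], \hat{A} = D^{-1/2}(A + I)D^{-1/2}, and omits the trainable weights and the activation; the layer actually used in this paper is defined in an earlier section and may differ in detail. The three-bus chain, the feature values and the function names are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square (weighted or 0/1) adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def propagate(matrix, features):
    """Aggregate node features: plain matrix product `matrix @ features`."""
    n, c = len(features), len(features[0])
    return [[sum(matrix[i][k] * features[k][col] for k in range(n)) for col in range(c)]
            for i in range(len(matrix))]


# Hypothetical 3-bus chain 1-2-3 with one voltage feature per bus; bus 2 is sagging.
chain = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
features = [[1.0], [0.4], [1.0]]

aggregated = propagate(normalize_adjacency(chain), features)  # neighbors are mixed in
unchanged = propagate(identity, features)                      # no aggregation at all
```

With an identity (or any diagonal) matrix, each node keeps only its own features, so the layer degenerates to an ordinary per-node transform; with the true adjacency matrix, the value of the middle bus is mixed with those of its neighbors. An adjacency matrix with no off-diagonal entries normalizes to the identity under this rule.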



In essence, the difference between GCN and general neural networks lies in the adjacency matrix, which is used to represent topological information. Therefore, we replace the weighted adjacency matrix A with different matrices (with the same dimension as A) in the GCN framework, and retrain the model to compare the detection and classification results. The results are shown in Table VII.

TABLE VII
CLASSIFICATION ACCURACIES (%) OF GCN BASED ON DIFFERENT MATRICES

Matrix type                    Only added in the first layer    Added in the first two layers    Added in the first three layers
Gaussian matrix                96.47                            96.18                            95.29
Uniform matrix                 85.88                            72.83                            72.55
All-ones matrix                75.91                            75.07                            73.95
Identity matrix                96.76                            96.76                            96.76
Unweighted adjacency matrix    97.87                            97.65                            98.23
Weighted adjacency matrix      97.76                            97.76                            98.28

Fig. 21. Impedance boundary experiment (x-axis: fault impedance (ohm), from 50 to 75; y-axis: accuracy (%); reference lines at accuracy = 95% and accuracy = 90%).

The weighted adjacency matrix used in previous experiments is replaced by different matrices, including the standard Gaussian distribution matrix whose elements follow the standard normal distribution [39], the standard uniform distribution matrix [40], the all-ones matrix (the matrix where all the elements are 1), the identity matrix, and the unweighted adjacency matrix whose elements are only "0" and "1". It can be seen from the table that non-adjacency matrices cannot represent the true and accurate spatial information of the transmission topology, so the accuracy of detection and classification decreases. Adding such incorrect matrices to more GCN layers generally reduces the accuracy further; the identity matrix, whose accuracy stays at 96.76%, is the exception. Moreover, different matrices have different negative effects on the accuracy of the model. In contrast, adding the correct adjacency matrix in a GCN layer means that the aggregation of nodal features and the transform of fault information are implemented in this layer, so the accuracy is the highest compared with adding other matrices. Besides, the unweighted adjacency matrix only contains topological structure information but no parameter information [41], yet GCN still achieves excellent accuracy with it. In theory, we reckon that edge weights can help the adjacency matrix aggregate nodal features more accurately in model training. But the results show that the fault classification network is not very sensitive to edge weights.

The above experiments show that explicitly extracting the spatio-temporal relations between nodal data helps to improve the accuracy of transient fault detection and classification, and that the adjacency matrix is the key factor.

V. CONCLUSION

This paper presents a novel method for the detection and classification of power transient faults. Considering that electric power data is a kind of spatio-temporal data, we regard the transmission line topology as a graph, so as to construct a graph classification model. Firstly, we propose a method for defining nodes and edge weights in the power grid topology. Secondly, we embed the topology information into the network so that the data of a single fault sample contain both temporal relationships and explicit spatial information, which provides more prior knowledge for the task and helps to improve the performance of the classifier. Experimental results on various situations show that the proposed method can distinguish several kinds of transient faults with high accuracy and strong generalizability. Further, the proposed method still shows sensitive and stable performance in the evaluation of response speed and robustness. We hold that the introduction of GCN is of great significance to the safe and stable operation of the transmission system and even of the whole power system.

However, the graph convolution method introduced in this paper is a spectral convolution, which has a solid theoretical foundation but poor flexibility. For example, once the adjacency matrix of a graph is determined, the structure of the graph is fixed, so a dynamic grid structure cannot be dealt with. In addition, the edge weight mentioned in this paper is expected to be more significant for fault location. We will consider using dynamic graph neural networks to address fault detection, classification and location in dynamic power grids, which ought to be a more meaningful work.

REFERENCES

[1] D. X. Zhang, X. Miao, L. P. Liu, Y. Zhang, and K. Y. Liu, "Research on development strategy for smart grid big data," Proceedings of the CSEE, vol. 35, no. 1, pp. 2–11, Jan. 2015.
[2] X. He, Q. Ai, R. C. Qiu, W. T. Huang, L. J. Piao, and H. C. Liu, "A big data architecture design for smart grids based on random matrix theory," IEEE Transactions on Smart Grid, vol. 8, no. 2, pp. 674–686, Mar. 2015.
[3] Z. Ling, D. Zhang, R. C. Qiu, Z. Jin, Y. Zhang, X. He, and H. Liu, "An accurate and real-time method of self-blast glass insulator location based on faster R-CNN and U-net with aerial images," CSEE Journal of Power and Energy Systems, vol. 5, no. 4, pp. 474–482, Dec. 2019.
[4] M. K. Neyestanaki and A. M. Ranjbar, "An adaptive PMU-based wide area backup protection scheme for power transmission lines," IEEE Transactions on Smart Grid, vol. 6, no. 3, pp. 1550–1559, May 2015.

[5] Z. X. Li, X. G. Yin, Z. Zhang, and Z. Q. He, "Wide-area protection fault identification algorithm based on multi-information fusion," IEEE Transactions on Power Delivery, vol. 28, no. 3, pp. 1348–1355, Jul. 2013.
[6] J. J. Song, E. Cotilla-Sanchez, G. Ghanavati, and P. D. H. Hines, "Dynamic modeling of cascading failure in power systems," IEEE Transactions on Power Systems, vol. 31, no. 3, pp. 2085–2095, May 2016.
[7] G. Khazal and A. Zamyatin, "Feature engineering for Arabic text classification," Journal of Engineering and Applied Sciences, vol. 14, no. 7, pp. 2292–2301, 2019.
[8] W. Sun, G. A. Yang, Q. Chen, A. Palazoglu, and K. Feng, "Fault diagnosis of rolling bearing based on wavelet transform and envelope spectrum correlation," Journal of Vibration and Control, vol. 19, no. 6, pp. 924–941, Apr. 2013.
[9] Z. G. Liu, Z. W. Han, Y. Zhang, and Q. G. Zhang, "Multiwavelet packet entropy and its application in transmission line fault recognition and classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 11, pp. 2043–2052, Nov. 2014.
[10] A. Jamehbozorg and S. M. Shahrtash, "A decision-tree-based method for fault classification in single-circuit transmission lines," IEEE Transactions on Power Delivery, vol. 25, no. 4, pp. 2190–2196, Oct. 2010.
[11] A. Recioui, B. Benseghier, and H. Khalfallah, "Power system fault detection, classification and location using the K-nearest neighbors," in Proceedings of the 4th International Conference on Electrical Engineering (ICEE), 2015, pp. 1–6.
[12] K. J. Chen, J. Hu, and J. L. He, "Detection and classification of transmission line faults based on unsupervised feature learning and convolutional sparse autoencoder," IEEE Transactions on Smart Grid, vol. 9, no. 3, pp. 1748–1758, May 2018.
[13] B. W. Guo, L. Li, and Y. Luo, "A new method for automatic seismic fault detection using convolutional neural network," presented at 2018 SEG International Exposition and Annual Meeting, Anaheim, California, 2018, pp. 1951–1955.
[14] X. Shi, R. Qiu, Z. N. Ling, F. Yang, H. S. Yang, and X. He, "Spatio-temporal correlation analysis of online monitoring data for anomaly detection and location in distribution networks," IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 995–1006, Mar. 2020.
[15] H. S. Yang, R. C. Qiu, X. Shi, and X. He, "Unsupervised feature learning for online voltage stability evaluation and monitoring based on variational autoencoder," Electric Power Systems Research, vol. 182, pp. 106253, May 2020.
[16] Z. Zhang, D. Zhang, and R. C. Qiu, "Deep reinforcement learning for power system applications: An overview," CSEE Journal of Power and Energy Systems, vol. 6, no. 1, pp. 213–225, Mar. 2020.
[17] Y. Ma, C. Huang, Y. Sun, G. Zhao, and Y. J. Lei, "Review of power spatio-temporal big data technologies for mobile computing in smart grid," IEEE Access, vol. 7, pp. 174612–174628, Dec. 2019.
[18] H. S. Yang, R. C. Qiu, L. Chu, T. B. Mi, X. Shi, and C. M. Liu, "Improving power system state estimation based on matrix-level cleaning," IEEE Transactions on Power Systems, vol. 35, no. 5, pp. 3529–3540, Sep. 2020.
[19] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, "Spectral networks and locally connected networks on graphs," arXiv:1312.6203, 2013.
[20] K. Xu, W. H. Hu, J. Leskovec, and S. Jegelka, "How powerful are graph neural networks?" arXiv:1810.00826, 2018.
[21] A. P. Bradley, "The use of the area under the ROC curve in the evaluation of machine learning algorithms," Pattern Recognition, vol. 30, no. 7, pp. 1145–1159, Jul. 1997.
[22] Z. H. Wu, S. R. Pan, F. W. Chen, G. D. Long, C. Q. Zhang, and P. S. Yu, "A comprehensive survey on graph neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, Jan. 2021.
[23] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," arXiv:1609.02907, 2016.
[24] W. Q. Fan, Y. Ma, Q. Li, Y. He, E. Zhao, J. L. Tang, and D. W. Yin, "Graph neural networks for social recommendation," in Proceedings of World Wide Web Conference, 2019, pp. 417–426.
[25] C. Hung, J. Nieto, Z. Taylor, J. Underwood, and S. Sukkarieh, "Orchard fruit segmentation using multi-spectral feature learning," in Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 5314–5320.
[26] Q. M. Li, Z. C. Han, and X. M. Wu, "Deeper insights into graph convolutional networks for semi-supervised learning," arXiv:1801.07606, 2018.
[27] J. Brownlee. (2018, Jul. 27). How to configure the number of layers and nodes in a neural network. [Online]. Available: https://machinelearningmastery.com/how-to-configure-the-number-of-layers-and-nodes-in-a-neural-network/.
[28] Z. Y. Qin, D. Kim, and T. Gedeon, "Rethinking softmax with cross-entropy: neural network classifier as mutual information estimator," arXiv:1911.10688, 2019.
[29] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in International Conference on Learning Representations (ICLR), 2015.
[30] C. E. Metz, "Basic principles of ROC analysis," Seminars in Nuclear Medicine, vol. 8, no. 4, pp. 283–298, Oct. 1978.
[31] S. K. Kim and E. S. Kim, "PSCAD/EMTDC-based modeling and analysis of a gearless variable speed wind turbine," IEEE Transactions on Energy Conversion, vol. 22, no. 2, pp. 421–430, Jun. 2007.
[32] S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha, "Privacy risk in machine learning: analyzing the connection to overfitting," in Proceedings of the IEEE 31st Computer Security Foundations Symposium, 2017, pp. 268–282.
[33] Instrument Transformers - Part 9: Digital Interface for Instrument Transformers, IEC 61869-9, 2016.
[34] W. T. Li and M. Wang, "Identifying overlapping successive events using a shallow convolutional neural network," IEEE Transactions on Power Systems, vol. 34, no. 6, pp. 4762–4772, Nov. 2019.
[35] U. B. Parikh, B. Das, and R. Maheshwari, "Fault classification technique for series compensated transmission line using support vector machine," International Journal of Electrical Power & Energy Systems, vol. 32, no. 6, pp. 629–636, Jul. 2010.
[36] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proceedings of the 30th International Conference on Machine Learning, 2013, p. 3.
[37] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, Jan. 2014.
[38] A. Ghaderi, H. L. Ginn III, and H. A. Mohammadpour, "High impedance fault detection: a review," Electric Power Systems Research, vol. 143, pp. 376–388, Feb. 2017.
[39] P. J. Forrester, "Log-gases and random matrices," Journal of Statistical Physics, vol. 134, no. 3, pp. 443–462, 2010.
[40] W. Blajer, "A geometrical interpretation and uniform matrix formulation of multibody system dynamics," ZAMM Journal of Applied Mathematics and Mechanics, vol. 81, no. 4, pp. 247–259, Apr. 2001.
[41] R. Seidel, "On the all-pairs-shortest-path problem in unweighted undirected graphs," Journal of Computer and System Sciences, vol. 51, no. 3, pp. 400–403, Dec. 1995.

Houjie Tong received the B.S. degree from the Department of Electrical Engineering, North China Electric Power University in 2019. He is currently pursuing the M.S. degree in Electrical Engineering from Shanghai Jiao Tong University. His current research interests include machine learning and graph deep learning with their applications in power systems.

Robert C. Qiu (F'15) received the Ph.D. degree in Electrical Engineering from New York University. He is currently the dean of the School of Telecommunications, Huazhong University of Science and Technology, and serves as a Professor in the Research Center for Big Data Engineering and Technologies, State Energy Smart Grid R&D Center, Department of Electronics and Electrical Engineering, Shanghai Jiao Tong University. He was with GTE Laboratories, Inc., Waltham and Bell Labs, Lucent Technologies. He was the Founder-CEO and the President of Wiscom Technologies, Inc., manufacturing and marketing WCDMA chipsets. In 2008, he became a Professor at the Center for Manufacturing Research, Department of Electrical and Computer Engineering, Tennessee Technological University. He was named a fellow of IEEE in 2015 for his contributions to ultra-wideband wireless communications. His current research interests include wireless communication and networking, random matrix theory based theoretical analysis for deep learning, and smart grid technologies.

Dongxia Zhang received the M.S. degree in Electrical Engineering from the Taiyuan University of Technology, Taiyuan, Shanxi, China, in 1992 and the Ph.D. degree in Electrical Engineering from Tsinghua University, Beijing, China, in 1999. From 1992 to 1995, she was a Lecturer with Taiyuan University of Technology. Since 1999, she has been working at China Electric Power Research Institute. She is the co-author of four books and more than 40 articles. Her research interests include power system analysis and planning, and big data and AI applications in power systems. She is an Associate Editor of Proceedings of the CSEE.

Haosen Yang received the B.S. degree from the South China University of Technology in 2017, and the M.S. degree from Shanghai Jiao Tong University in 2019. He is pursuing the Ph.D. degree at the School of Electronics and Electrical Engineering, Shanghai Jiao Tong University. His research interests include voltage stability and state estimation of power grids, machine learning, and data science.

Qi Ding is currently pursuing the master's degree at Shanghai Jiao Tong University. His current research interests include model compression and machine learning applications in smart grids.

Xin Shi (S’19) received the Ph.D. degree from the School of Electronics and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China. He is now a Lecturer in the School of Control and Computer Engineering, North China Electric Power University, Beijing, China. His research interests include power system analysis, random matrix theory, and machine learning.
