A New Framework For Smartphone Sensor Based Human Activity Recognition Using Graph Neural Network
Abstract— Automatic human activity recognition (HAR) through computing devices is a challenging research topic in the domain of computer vision. It has widespread applications in various fields such as sports, healthcare, and criminal investigation. With the advent of smart devices like smartphones, built-in inertial sensors such as the accelerometer and gyroscope can easily be used to track our daily physical movements. State-of-the-art deep neural network models like the Convolutional Neural Network (CNN) do not need any additional feature extraction for such applications. However, they require a huge amount of training data, which is time-consuming to collect and demands ample resources. Another limiting factor of the CNN is that it learns only from the features of an individual sample, without considering any structural information among the samples. To address these issues, we propose a fast, end-to-end Graph Neural Network (GNN) that efficiently captures not only the information of each individual sample but also its relationship with other samples in the form of an undirected graph structure. To the best of our knowledge, this is the first work in which time-series sensor data are transformed into a structural graph representation for HAR. The proposed model has been evaluated on 6 publicly available datasets and achieves nearly 100% recognition accuracy on all of them. The source code of this work is available at https://github.com/riktimmondal/HAR-Sensor.
Index Terms— Human Activity Recognition, Graph Neural Network (GNN), Message Passing, Smartphone sensors, Deep
learning.
h_v^{(k)} = \mathrm{NON\text{-}LINEARITY}\Big(\mathrm{COMBINE}\big(h_v^{(k-1)},\ \mathrm{AGGREGATE}(\{h_u^{(k-1)} : u \in N(v)\})\big)\Big)   (1)

Here, NON-LINEARITY is a nonlinear function such as Sigmoid, ReLU or ELU, and N(v) represents all the neighboring nodes of v directly connected to it by an edge. We initialize h_v^{(0)} = X_v. The COMBINE and AGGREGATE functions can be "add", "mean", "max" or "concatenation", depending on the type of GNN model.

In our present work, we have used GraphConv as the message passing layer, where the representation vector at layer k > 0 is given by

h_v^{(k)} = \sigma\Big(h_v^{(k-1)} W_1^{(k)} + \sum_{u \in N(v)} h_u^{(k-1)} W_2^{(k)}\Big)   (2)

where W_1^{(k)} and W_2^{(k)} are weight matrices to be learned by the model, \sigma is the component-wise NON-LINEARITY function, and \sum is the AGGREGATE function.

For node classification, the vector h_v^{(K)} at the final K-th layer can be used directly for label prediction. For whole-graph classification, the embedding vectors h_v^{(K)} of all nodes v present in the graph G are aggregated into a graph-level prediction using a READOUT function:

h_G = \mathrm{READOUT}\big(\{h_v^{(K)} : v \in G\}\big)   (3)

Here, READOUT is the final aggregation function used to combine the embedding vectors of all nodes v ∈ G through operations like "add", "mean", etc. For our task, we have considered both the "mean" and the "max" functions as the aggregation functions.

Fig. 2: One sample graph showing an activity class generated for each of the 6 HAR datasets.
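To make the GraphConv update of Eq. (2) concrete, the following minimal NumPy sketch computes one layer on a toy graph; the graph, weights and dimensions are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy graph: 4 nodes, 3 input features, 2 hidden features (all assumed).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))             # h^(0): initial node features X_v
A = np.array([[0, 1, 1, 0],             # undirected adjacency matrix
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]])
W1 = rng.normal(size=(3, 2))            # self weight matrix W1^(k)
W2 = rng.normal(size=(3, 2))            # neighbour weight matrix W2^(k)

def relu(x):
    return np.maximum(x, 0.0)

# Eq. (2): h_v^(k) = sigma(h_v^(k-1) W1 + sum_{u in N(v)} h_u^(k-1) W2).
# A @ X sums each node's neighbour features, i.e. the "add" AGGREGATE.
H1 = relu(X @ W1 + (A @ X) @ W2)
print(H1.shape)                         # (4, 2): one hidden vector per node
```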
A. Proposed GNN based HAR model

Our network starts with a GraphConv layer that takes two inputs: (a) X, the node feature matrix of shape [V, D], where V is the total number of nodes in the graph and D is the dimensionality of the input feature vector; and (b) the adjacency matrix A of shape [V, V], or equivalently the edge list E of shape [2, L], consisting of all edges present in the entire graph as pairs (V_1, V_2), where V_1 and V_2 are two nodes connected by an edge and L is the total number of edges in the graph.

The output of the GraphConv layer is passed through a ReLU function to introduce non-linearity. It is followed by a TopKPooling layer, which takes X and E as inputs to reduce the number of graph nodes and convert the graph into a sub-graph: a coarsened version of the entire graph that still captures its high-dimensional representation. This output (O_1) is passed to another block of the same GraphConv-ReLU-TopKPooling layers, whose output (O_2) is added to O_1. The combined O_1 + O_2 output is passed through a linear layer with ReLU activation and a dropout layer (dropout rate = 0.5). Then, a linear layer with dimension equal to the number of classes of the problem under consideration, with Log Softmax as the activation function, is used to produce the final probability vector Z:

\mathrm{LogSoftmax}(Z_i) = \log\Big(\frac{e^{Z_i}}{\sum_{j=1}^{C} e^{Z_j}}\Big)   (4)

where Z_i is the value of the i-th element of the last linear layer's output vector, and \sum_{j=1}^{C} e^{Z_j} is the sum over all C classes (including i) in the vector Z. Our proposed GNN model is shown in Fig. 3. We have used the Negative Log Likelihood (NLL) function, which needs to be minimized, as our classification objective:

\mathrm{NLL}(Z) = -\sum_{i=1}^{C} y_i \cdot \mathrm{LogSoftmax}(Z_i)   (5)

where y_i represents the ground truth label of the i-th graph.
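A compact sketch of this architecture in PyTorch Geometric (whose GraphConv and TopKPooling layers match the ones named above) could look as follows; the hidden width, pooling ratio, and the exact placement of the mean+max readout are our assumptions, and the repository linked in the abstract remains the authoritative implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GraphConv, TopKPooling
from torch_geometric.nn import global_mean_pool as gap, global_max_pool as gmp

class HARGNN(torch.nn.Module):
    """Sketch: two GraphConv-ReLU-TopKPooling blocks with a skip
    connection (O1 + O2), then linear layers and LogSoftmax (Eq. (4))."""
    def __init__(self, num_features, num_classes, hidden=64):
        super().__init__()
        self.conv1 = GraphConv(num_features, hidden)
        self.pool1 = TopKPooling(hidden, ratio=0.5)   # ratio is assumed
        self.conv2 = GraphConv(hidden, hidden)
        self.pool2 = TopKPooling(hidden, ratio=0.5)
        self.lin1 = torch.nn.Linear(2 * hidden, hidden)  # mean+max readout
        self.lin2 = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        # Block 1: GraphConv -> ReLU -> TopKPooling, then readout O1.
        x = F.relu(self.conv1(x, edge_index))
        x, edge_index, _, batch, _, _ = self.pool1(x, edge_index, batch=batch)
        o1 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)
        # Block 2 on the coarsened sub-graph, then readout O2.
        x = F.relu(self.conv2(x, edge_index))
        x, edge_index, _, batch, _, _ = self.pool2(x, edge_index, batch=batch)
        o2 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)
        # Skip connection: combine graph-level representations O1 + O2.
        x = F.relu(self.lin1(o1 + o2))
        x = F.dropout(x, p=0.5, training=self.training)
        return F.log_softmax(self.lin2(x), dim=-1)    # Eq. (4)
```

Applying F.nll_loss to these log-probabilities yields the NLL objective of Eq. (5); the optimizer references cited later ([26], [27]) suggest gradient-descent training, e.g. with Adam.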
IV. EXPERIMENTATION

For each of the 6 HAR datasets mentioned earlier, we have split the data into train and test sets in a 70:30 ratio; the number of graphs thus obtained for training and testing on each dataset is shown in Table II. We have performed our experiments 5 times, each time randomly shuffling the dataset and splitting it into the said train (70%) and test (30%) sets. The model performance is then evaluated on the held-out test set.
We have further performed an experiment where we have iterated the splitting ratio for the train set from 10% to 90% (testing on the remaining 90% down to 10% of the data, accordingly) in steps of 10%, to show the improvement of our model as the training data increase. The model also performs well on imbalanced data, for which we have considered two cases: (1) when the imbalance is in the number of graphs per class (obtained mean accuracy: 99.37%), and (2) when the imbalance is in the number of nodes in the graphs of a class (obtained mean accuracy: 99.09%).
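As a sketch, the evaluation protocol described above (5 repetitions of a shuffled 70/30 split, plus a sweep of the train fraction from 10% to 90%) could be reproduced as follows; `graphs` and `labels` are placeholders standing in for the per-dataset graph objects and activity classes.

```python
from sklearn.model_selection import train_test_split

# Placeholder data: in practice these are the per-dataset graphs/labels.
graphs = list(range(100))
labels = [i % 6 for i in range(100)]

# 5 repetitions of a randomly shuffled 70/30 split.
for run in range(5):
    tr_g, te_g, tr_y, te_y = train_test_split(
        graphs, labels, test_size=0.30, shuffle=True, random_state=run)
    # ... train the GNN on (tr_g, tr_y), evaluate on (te_g, te_y) ...

# Sweep of the train fraction from 10% to 90% in steps of 10%.
for pct in range(10, 100, 10):
    tr_g, te_g, tr_y, te_y = train_test_split(
        graphs, labels, train_size=pct / 100, shuffle=True)
    # ... record test accuracy for this training fraction ...
```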
Fig. 5: Progressive accuracies of train and test sets with increasing training data to the GNN model for the 6 HAR datasets.

TABLE V: COMPARISON OF PROPOSED MODEL WITH EXISTING MODELS FOR 6 HAR DATASETS

Dataset  | Author               | Year | Methodology or Classifier Used        | Accuracy (%)
MobiAct  | Zheng et al. [28]    | 2018 | TASG applied with SVM                 | 90.5
         | Xu et al. [29]       | 2019 | CNN with LSTM Model                   | 98.9
         | Zerkouk et al. [30]  | 2020 | Autoencoder with CNN and LSTM         | 98
         | Proposed             | 2020 | GNN model                             | 100
WISDM    | Quispe et al. [7]    | 2018 | KNN                                   | 96.2
         | Zhang et al. [31]    | 2019 | U-Net                                 | 96.4
         | Burns et al. [32]    | 2020 | Deep triplet embedding                | 91.3
         | Proposed             | 2020 | GNN model                             | 100
MHEALTH  | Gumaei et al. [33]   | 2019 | Hybrid Deep Learning Model            | 99.6
         | Chen et al. [34]     | 2019 | Recurrent Convolutional & Attention   | 94.05
         | Burns et al. [32]    | 2020 | Deep triplet embedding                | 99.9
         | Proposed             | 2020 | GNN model                             | 100
PAMAP2   | Dadkhahi et al. [10] | 2017 | Tree structured                       | 97
         | Gupta et al. [37]    | 2020 | FECM                                  | 89.9
         | Qin et al. [38]      | 2020 | GASF Concatenation                    | 96.74
         | Proposed             | 2020 | GNN model                             | 100
USC-HAD  | …

D. Comparison with existing HAR methods

We have also compared our method with some recent methods. From Table V, it is evident that our model performs better than the existing methods considered, by learning from the time-series sensor data through a graph representation. We have thus set a new benchmark result on 6 standard, publicly available sensor-based HAR datasets. We have also contributed by developing a generic framework to convert time series sensor data into graph data that can be used for training any GNN
model. In future, we plan to extend this framework to use other graph models like spline convolution, attention-based convolution, etc. Besides, we have considered only node feature attributes; in our future work, we will work on edge feature attributes as well.
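Since the exact time-series-to-graph conversion falls outside this excerpt, the following is only a hypothetical illustration of such a framework: each windowed sensor sample becomes a node carrying its feature vector, and undirected edges link nearest-neighbour samples. The k-nearest-neighbour rule here is purely an assumption for illustration, not the paper's actual edge-construction scheme.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def series_to_graph(samples, k=3):
    """Hypothetical sketch: each windowed sensor sample is a node whose
    feature vector is the raw window; undirected edges connect each node
    to its k nearest neighbours in feature space (an assumed rule)."""
    X = np.asarray(samples, dtype=float)           # [V, D] node feature matrix
    adj = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
    adj = adj.maximum(adj.T)                       # symmetrise -> undirected
    src, dst = adj.nonzero()
    edge_index = np.stack([src, dst])              # [2, L] edge list E
    return X, edge_index

# Example: 20 fake sensor windows of dimension 9 (3 axes x 3 statistics).
X, E = series_to_graph(np.random.randn(20, 9))
```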
REFERENCES

[1] V. Osmani, S. Balasubramaniam, and D. Botvich, “Human activity recognition in pervasive health-care: Supporting efficient remote collaboration”, Journal of Network and Computer Applications, vol. 31, no. 4, pp. 628–655, 2008.
[2] Y.-L. Hsu, S.-C. Yang, H.-C. Chang, and H.-C. Lai, “Human daily and sport activity recognition using a wearable inertial sensor network”, IEEE Access, vol. 6, pp. 31715–31728, 2018.
[3] S. Ramasamy Ramamurthy and N. Roy, “Recent trends in machine learning for human activity recognition—a survey”, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 4, e1254, 2018.
[4] M. Ehatisham-ul-Haq, M. A. Azam, J. Loo, K. Shuang, S. Islam, U. Naeem, and Y. Amin, “Authentication of smartphone users based on activity recognition and mobile sensing”, Sensors, vol. 17, no. 9, p. 2043, 2017.
[5] L. Shi, Y. Zhang, J. Cheng, and H. Lu, “Skeleton-based action recognition with multi-stream adaptive graph convolutional networks”, arXiv preprint arXiv:1912.06971, 2019.
[6] D. Micucci, M. Mobilio, and P. Napoletano, “Unimib shar: A dataset for human activity recognition using acceleration data from smartphones”, Applied Sciences, vol. 7, no. 10, p. 1101, 2017.
[7] K. G. Montero Quispe, W. Sousa Lima, D. Macêdo Batista, and E. Souto, “Mboss: A symbolic representation of human activity recognition using mobile sensors”, Sensors, vol. 18, no. 12, p. 4354, 2018.
[8] K. Walse, R. Dharaskar, and V. Thakare, “Performance evaluation of classifiers on wisdm dataset for human activity recognition”, in Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, 2016, pp. 1–7.
[9] J. R. Kwapisz, G. M. Weiss, and S. A. Moore, “Activity recognition using cell phone accelerometers”, ACM SIGKDD Explorations Newsletter, vol. 12, no. 2, pp. 74–82, 2011.
[10] H. Dadkhahi and B. M. Marlin, “Learning tree-structured detection cascades for heterogeneous networks of embedded devices”, in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1773–1781.
[11] A. Reiss and D. Stricker, “Introducing a new benchmarked dataset for activity monitoring”, in 2012 16th International Symposium on Wearable Computers, IEEE, 2012, pp. 108–109.
[12] Z. Jiang, Y. Zheng, H. Tan, B. Tang, and H. Zhou, “Variational deep embedding: An unsupervised and generative approach to clustering”, arXiv preprint arXiv:1611.05148, 2016.
[13] A. Stisen, H. Blunck, S. Bhattacharya, T. S. Prentow, M. B. Kjærgaard, A. Dey, T. Sonne, and M. M. Jensen, “Smart devices are different: Assessing and mitigating mobile sensing heterogeneities for activity recognition”, in Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, 2015, pp. 127–140.
[14] O. Politi, I. Mporas, and V. Megalooikonomou, “Human motion detection in daily activity tasks using wearable sensors”, in 2014 22nd European Signal Processing Conference (EUSIPCO), IEEE, 2014, pp. 2315–2319.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks”, in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[16] T. Mikolov, S. Kombrink, L. Burget, J. Černocký, and S. Khudanpur, “Extensions of recurrent neural network language model”, in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2011, pp. 5528–5531.
[17] W. Xu, Y. Pang, Y. Yang, and Y. Liu, “Human activity recognition based on convolutional neural network”, in 2018 24th International Conference on Pattern Recognition (ICPR), IEEE, 2018, pp. 165–170.
[18] A. Murad and J.-Y. Pyun, “Deep recurrent neural networks for human activity recognition”, Sensors, vol. 17, no. 11, p. 2556, 2017.
[19] J. M. H. Priyadharshini, S. Kavitha, and B. Bharathi, “Comparative analysis of multilayer backpropagation and multichannel deep convolutional neural network for human activity recognition”, in AIP Conference Proceedings, AIP Publishing LLC, vol. 2095, 2019, p. 030014.
[20] L. Wang and R. Liu, “Human activity recognition based on wearable sensor using hierarchical deep lstm networks”, Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2020.
[21] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications”, arXiv preprint arXiv:1812.08434, 2018.
[22] C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe, “Weisfeiler and leman go neural: Higher-order graph neural networks”, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, 2019, pp. 4602–4609.
[23] G. Vavoulas, C. Chatzaki, T. Malliotakis, M. Pediaditis, and M. Tsiknakis, “The mobiact dataset: Recognition of activities of daily living using smartphones”, in ICT4AgeingWell, 2016, pp. 143–151.
[24] O. Banos, R. Garcia, J. A. Holgado-Terriza, M. Damas, H. Pomares, I. Rojas, A. Saez, and C. Villalonga, “Mhealthdroid: A novel framework for agile development of mobile health applications”, in International Workshop on Ambient Assisted Living, Springer, 2014, pp. 91–98.
[25] M. Zhang and A. A. Sawchuk, “Usc-had: A daily activity dataset for ubiquitous activity recognition using wearable sensors”, in Proceedings of the 2012 ACM Conference on Ubiquitous Computing, 2012, pp. 1036–1043.
[26] S. Ruder, “An overview of gradient descent optimization algorithms”, arXiv preprint arXiv:1609.04747, 2016.
[27] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization”, arXiv preprint arXiv:1412.6980, 2014.
[28] Z. Zheng, J. Du, L. Sun, M. Huo, and Y. Chen, “Tasg: An augmented classification method for impersonal har”, Mobile Information Systems, vol. 2018, 2018.
[29] J. Xu, Z. He, and Y. Zhang, “Cnn-lstm combined network for iot enabled fall detection applications”, in Journal of Physics: Conference Series, IOP Publishing, vol. 1267, 2019, p. 012044.
[30] M. Zerkouk and B. Chikhaoui, “Spatio-temporal abnormal behavior prediction in elderly persons using deep learning models”, Sensors, vol. 20, no. 8, p. 2359, 2020.
[31] Y. Zhang, Z. Zhang, Y. Zhang, J. Bao, Y. Zhang, and H. Deng, “Human activity recognition based on motion sensor using u-net”, IEEE Access, vol. 7, pp. 75213–75226, 2019.
[32] D. M. Burns and C. M. Whyne, “Personalized activity recognition with deep triplet embeddings”, arXiv preprint arXiv:2001.05517, 2020.
[33] A. Gumaei, M. M. Hassan, A. Alelaiwi, and H. Alsalman, “A hybrid deep learning model for human activity recognition using multimodal body sensing data”, IEEE Access, vol. 7, pp. 99152–99160, 2019.
[34] K. Chen, L. Yao, D. Zhang, X. Wang, X. Chang, and F. Nie, “A semisupervised recurrent convolutional attention model for human activity recognition”, IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 5, pp. 1747–1756, 2019.
[35] X. Zhang, Y. Wong, M. S. Kankanhalli, and W. Geng, “Hierarchical multi-view aggregation network for sensor-based human activity recognition”, PLoS One, vol. 14, no. 9, e0221390, 2019.
[36] Z. Qin, L. Hu, N. Zhang, D. Chen, K. Zhang, Z. Qin, and K.-K. R. Choo, “Learning-aided user identification using smartphone sensors for smart homes”, IEEE Internet of Things Journal, vol. 6, no. 5, pp. 7760–7772, 2019.
[37] A. Gupta, H. P. Gupta, B. Biswas, and T. Dutta, “A fault-tolerant early classification approach for human activities using multivariate time series”, IEEE Transactions on Mobile Computing, 2020.
[38] Z. Qin, Y. Zhang, S. Meng, Z. Qin, and K.-K. R. Choo, “Imaging and fusing time series for wearable sensor-based human activity recognition”, Information Fusion, vol. 53, pp. 80–87, 2020.
[39] S. Ashry, R. Elbasiony, and W. Gomaa, “An lstm-based descriptor for human activities recognition using imu sensors”, in Proceedings of the 15th International Conference on Informatics in Control, Automation and Robotics, ICINCO, vol. 1, 2018, pp. 494–501.
[40] S. B. ud din Tahir, A. Jalal, and M. Batool, “Wearable sensors for activity analysis using smo-based random forest over smart home and sports datasets”, in 2020 3rd International Conference on Advancements in Computational Sciences (ICACS), IEEE, 2020, pp. 1–6.
[41] S. P. Singh, A. Lay-Ekuakille, D. Gangwar, M. K. Sharma, and S. Gupta, “Deep convlstm with self-attention for human activity decoding using wearables”, arXiv preprint arXiv:2005.00698, 2020.

Riktim Mondal received his B.E. degree from the Department of Computer Science and Engineering, Jadavpur University. His areas of interest are Computer Vision, Pattern Recognition, Natural Language Processing and Human Computer Interface.

Pawan Kumar Singh (email: pawankrsingh@ieee.org) received his B.Tech degree in Information Technology from West Bengal University of Technology in 2010. He received his M.Tech in Computer Science and Engineering and Ph.D. (Engineering) degrees from Jadavpur University (J.U.) in 2013 and 2018, respectively. He also received the RUSA 2.0 fellowship for pursuing his post-doctoral research in J.U. in 2019. He is currently working as an Assistant Professor in the Department of Information Technology in J.U. He has published more than 50 research papers in peer-reviewed journals and international conferences. His areas of current research interest are Computer Vision, Pattern Recognition, Handwritten Document Analysis, Image and Video Processing, Machine Learning and Artificial Intelligence. He is a member of the IEEE (U.S.A.), The Institution of Engineers (India) and the Association for Computing Machinery (ACM), as well as a life member of the Indian Society for Technical Education (ISTE, New Delhi) and the Computer Society of India (CSI).

Vikrant Bhateja (email: bhateja.vikrant@ieee.org) is working as an Associate Professor in the Department of ECE, SRMGPC, Lucknow. His areas of research include digital image and video processing, computer vision, medical imaging, machine learning, pattern analysis and recognition. He has around 155 quality publications in various international journals and conference proceedings. He is an associate editor of IJSE and IJACI. He has edited more than 22 volumes of conference proceedings with Springer Nature and is presently the EiC of the IGI Global IJNCR journal.