Intrusion Detection System On IoT With 5G Network
Research Article
Intrusion Detection System on IoT with 5G Network Using Deep Learning
1 School of Computer Science, Lovely Professional University, Phagwara, Punjab, India
2 Department of Computer Science, Babasaheb Bhimrao Ambedkar University (Central University), Satellite Centre, Amethi, UP, India
3 Department of Computer Science and Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
Received 15 June 2021; Revised 14 February 2022; Accepted 17 February 2022; Published 10 March 2022
Copyright © 2022 Neha Yadav et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cyberattacks on Internet of Things (IoT) deployments of fully integrated servers, applications, and communication networks are increasing at an exponential rate. Because problems in an IoT network can remain undetected for long periods, compromised devices harm end users, increase cyber threats and identity misuse, raise costs, and reduce revenue. For effective safety and security, attacks on IoT interfaces must be observed in near real time. In this paper, a smart intrusion detection system suited to detecting IoT-based attacks is implemented. In particular, a deep learning algorithm is used to detect malicious IoT network traffic. The proposed solution ensures secure operation and supports interoperability among IoT connectivity protocols. An intrusion detection system (IDS) is one of the most popular types of network security technology used to secure networks. According to our experimental results, the proposed architecture readily recognizes real-world intruders, and the use of a neural network to detect attacks works exceptionally well. In addition, there is an increasing focus on providing user-centric cybersecurity solutions, which necessitate the collection, processing, and analysis of massive amounts of data traffic and network connections in 5G networks. In testing, the autoencoder model outperformed the alternatives, effectively reducing detection time as well as improving detection precision. Using the proposed technique, an accuracy of 99.76% was achieved.
Figure 1: Various applications of the 5G network, including voice, sensors, connected health, asset tracking, smart buildings, and the Internet of Things.
The model Internet of Things has a different system of interaction technology, including wireless sensor networks. Therefore, in this case, the role of fog computation is also evident. Fog computing, or fogging, mostly consists of the effective distribution of data, transmission, storage, and applications between data sources and the cloud through a decentralized networking and computing framework. Various applications of the 5G network are depicted in Figure 1, which was reproduced from the article [5].

1.2. The Architecture of 5G Technology. 5G has an integrated infrastructure that usually updates network modules and terminals to accommodate new scenarios. Advanced technology may often be used by service companies to easily take advantage of value-added offers. However, upgradability is based on cognitive radio technology, which has many important characteristics, such as the ability of devices to recognize their place, voice, sensors, health, energy, environment, temperature, etc. In its working environment, cognitive radio equipment works as a transceiver (beam) that can perceive radio signals and respond to them. Furthermore, it detects changes in its environment automatically and responds to them continuously.

As soon as 4G becomes publicly accessible, package companies will be forced to adopt a 5G network [6]. To satisfy customer demands and resolve conflicts in the 5G environment, a fundamental change in the growth of 5G wireless cellular technology is needed. According to the researchers' major findings in [7], the majority of wireless customers spend essentially 80% of their time indoors and 20% of their time outdoors (2018). A narrow blend of both NFV and SDN innovations allows 5G networks to efficiently detect cyberattacks and mitigate them. To address this challenge, a new concept or design technique for planning the 5G cellular architecture has emerged: distinguishing between outside and inside setups [8]. The accessibility loss across a building's boundaries will be slightly decreased with this design technique. User details would be filtered by other user computers, as in device-to-device communications, so the anonymity of this information is the key concern. Closed access ensures the confidentiality of programs at the system stage. The system lists as trustworthy devices those consumers situated near you or at your place of business; otherwise, a trusted entity, e.g., an organization, can easily link up and maintain a degree of confidentiality, whereas devices that are not included in this list can interact at the macrocell phase.

One of the important issues in a 5G network lies in the components used at the design as well as the deployment phase, as every element needs authentication with all of the other elements in the network architecture even before initiating any operation; at the physical layer, the network components are also required to be built from trustworthy network components. As internet traffic grows continuously and the domain constantly updates 5G and IoT technologies, many security breaches arise that can easily be exploited by intrusion-based attacks such as Denial-of-Service (DoS/DDoS) attacks, which affect not only the application layer of the Open Systems Interconnection (OSI) model but the network layer as well. In this paper, the dataset used for implementation consists of all such kinds of attacks that are feasible not only on 5G networks but also on IoT-based systems. Hence, a novel technique to detect such attacks is elaborated in Section 3 of this paper.

1.3. The Contribution of the Paper Is as Follows

(i) An autoencoder-based novel deep learning technique is implemented for the detection of network attacks

(ii) Several machine learning algorithms are used for implementation purposes

(iii) A recent benchmark dataset is used for implementation purposes

(iv) A comparative analysis of the work with existing frameworks has been provided

1.4. The Structure of This Paper Is as Follows. Section 2 outlines the literature review based upon various intrusion detection systems. The third section reveals the research methodology.
Wireless Communications and Mobile Computing

Table 1: Comparative analysis of various existing frameworks.

Year [Ref] | Performance evaluation parameters | Dataset | Algorithm/technique/approach | Findings
2017 [10] | Accuracy, precision, recall, F1 | RedIRIS | RNN and CNN | The authors improvised the existing deep learning algorithms by modifying the hidden layers.
2017 [11] | Accuracy, precision, recall, F1, FAR | KDD 99 Cup | GRU and RF | Minimizing the loss function helped the researchers achieve better results.
2018 [12] | Accuracy, precision, recall, F1 score, MR, FAR, detection time | UNSW_NB15 | BLSTM RNN | Feature normalization and conversion of categorical features to numeric values helped generate improved results.
2018 [13] | Accuracy, precision, recall, F1 measure | NSL-KDD | DNN | SGD was used to minimize the loss function of the DNN.
2019 [14] | Accuracy, precision, recall | CICIDS2017 | MLP, 1d-CNN, LSTM, and CNN+LSTM | The researchers balanced the dataset by duplicating records during data processing.
2019 [15] | Precision, recall (TPR), F1 score | NSL-KDD | DBN | The deep neural network was optimized by assigning a cost function to each layer of the proposed model.
2019 [16] | Accuracy, precision, recall, F1 measure | NSL-KDD | SDPN | The SMO algorithm was used for optimal selection of features.
2020 [17] | Accuracy, precision, recall, F1 measure | NSL-KDD | RF | The Weka tool was used for evaluation purposes.
2021 [18] | Accuracy, precision, recall, F1 measure | NSL-KDD, KDD99 | ANN | A stack-based feature selection technique was proposed to optimize the computation time.
2021 [19] | F1 score | Bot-IoT | RF, NB, and MLP | A hierarchical approach was used for intrusion detection.
2021 [20] | Accuracy, precision, recall, F1 measure | CICIDS2017 | HW-DBN | A low-frequency attack was detected.
3. Proposed Methodology

3.1. Background of Deep Learning Architectures. Deep learning is one of the popular techniques of data mining. Deep learning is a valuable algorithm for modelling abstract concepts and relationships across two or more neural layers [27]. Deep learning is currently being studied in several areas, including image identification, speech, natural language processing, social network filtering, and so forth. In addition to finding correlations between vast data from different sources, deep learning algorithms vary in their ability to carry out attribute learning, classification, or clustering tasks at the same time. Various deep
learning techniques which are used by different researchers for the development of intrusion detection systems are discussed as follows:

(i) Generative Architectures. Unlabelled raw data dynamically trains algorithms to carry out different activities. This is the most general architecture in the architectural class category.

(ii) Autoencoder (AE). The autoencoder, described by Gao et al. [28], is a neural network widely used to minimize dimensionality by providing an improved data representation compared to the raw inputs. The AE contains layers with the same number of feature vectors, in addition to a hidden layer with a low-dimensional representation. An AE incorporates and trains an encoder and a decoder with backpropagation. As knowledge is
translated into a small abstraction, it captures the coarse characteristics and learns the representation of details. Afterward, the decoder receives the small representations and reconstructs the original features [29]. Some AE extensions, such as the stacked AE (SAE), sparse AE, and denoising AE, are available.

(iii) SAE. The SAE cascades through a vast network via more than one hidden layer. The features used to create a new data representation are more thoroughly learned [30].

(iv) Sparse AE. The hidden units in a sparse AE are few in number. While there are many hidden units available to learn data representations, the AE remains valuable. Sparsity constraints are intended to ensure that most neurons are inactive [31], with a low average output.

(v) Denoising AE. Denoising is built on the use of corrupted data to refine the data representation, where the hidden layers only use stable feature vectors [32].

(vi) Restricted Boltzmann Machine (RBM). A probabilistic neural network, the Boltzmann machine (BM), was created by Hinton and Sejnowski. A BM network consists of symmetrically connected binary units, and the connectivity specifies which interactions are permitted. The many interactions between units, however, contribute to very slow learning [33]. The RBM is the undirected
paradigm of Smolensky (1986), which resolves the BM's intractability. The principle of the RBM is that neuronal connections within the same layer are removed. The RBM contains a visible layer and a hidden layer of latent variables for the initial input variables. The units in the visible layer are connected to the hidden-layer units with corresponding weights. The hidden units learn the feature distribution of the input variables. As an initial stage, the RBM is typically employed as a preprocessing feature extractor or for initializing the parameters of another learning network. The RBM may be used as a grouping model as well. The nonlinear, autonomous classifier of Larochelle and Bengio is known as the discriminative RBM (DRBM). If several Boltzmann machines are cascaded, the result is called a deep Boltzmann machine (DBM).

(vii) Deep Belief Network (DBN). The DBN consists of stacked RBMs, which are trained in a greedy layer-wise way. Each RBM is trained with respect to the previous RBM, and each represents a contribution
to the hidden layer of the previous RBM. This deep learning algorithm is efficient and fast to train. If an additional bias layer is applied, the DBN generalizes to both dimensionality reduction and independent classification in practical applications.

(viii) Recurrent Neural Network (RNN). Hopfield suggested the RNN, a dynamic neural feedback network, in 1982. In normal forward transmission, depending upon the neural network architecture and its dependencies, the output of each layer consists of the same unit of neurons. RNNs differ from feedforward systems in that the hidden layers feed back into the network. Various memory-unit models, such as long short-term memory, have been developed, and the gated recurrent unit can also be used.

(ix) Long Short-Term Memory (LSTM). The vanishing gradient problem of the RNN is addressed by the LSTM, which learns long-term dependencies by utilizing a gating scheme. Every LSTM unit is equipped with a memory cell containing old states.

(x) Gated Recurrent Unit (GRU). The GRU is a lightweight version of the LSTM: the architecture has been streamlined, the gates merged, and the states integrated.

(xii) Linear Function (LF). As rightly named, it is a single line multiplying the input by a constant multiplier.

(xiii) Nonlinear Function (NLF). The nonlinear function is split into three subsets, including the sigmoid, an S-shaped curve ranging from zero to 1; the S-shaped curve with a scale of -1 to 1 corresponds to the hyperbolic tangent (tanh).

Figure 10: Proposed model summary.

Figure 11: Sample feature description.
No. | Name | Type | Description
1 | Srcip | Nominal | Source IP address
2 | Sport | Integer | Source port number
3 | Dstip | Nominal | Destination IP address
4 | Dsport | Integer | Destination port number
5 | Proto | Nominal | Transaction protocol

Figure 12: Dataset distribution (pie chart of normal and abnormal labels: normal 75.99%, abnormal 24.01%).

3.2. Proposed Architecture. In this work, the UNSW-NB15 2015 benchmark dataset was used. Initially, the dataset was analysed in the data preprocessing phase, where the null-value-based columns were dropped. Further, the updated dataset was provided to the feature selection and feature scaling phase. In this phase, important features were selected using the Pearson correlation technique. The attack categories used for analysis are depicted in Figure 2. After obtaining the important features, the categorical features were
Table 3: Results obtained using various machine learning algorithms.

Sr. No. | Algorithm | Accuracy | R2 score | Precision | Recall | F1 score | MAE | MSE | RMSE
1 | XGBoost Classifier | 0.984 | 0.910 | 0.98 | 0.99 | 0.99 | 0.015 | 0.015 | 0.125
2 | AdaBoost Classifier | 0.983 | 0.910 | 0.98 | 0.99 | 0.99 | 0.016 | 0.0164 | 0.128
3 | ExtraTree Classifier | 0.983 | 0.910 | 0.98 | 0.99 | 0.99 | 0.016 | 0.016 | 0.128
4 | Random Forest Classifier | 0.986 | 0.924 | 0.99 | 0.99 | 0.99 | 0.013 | 0.013 | 0.117
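To make the reported columns concrete, the sketch below shows how the metrics in the table above (accuracy, precision, recall, F1, MAE, MSE, and RMSE) are computed from a binary classifier's predictions. It is a plain-Python illustration with invented label vectors, not the paper's UNSW-NB15 outputs or the authors' code.

```python
import math

def binary_metrics(y_true, y_pred):
    """Compute the tabled metrics from binary labels (1 = attack, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # For 0/1 labels, MAE and MSE both equal the misclassification rate.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Illustrative labels only, not predictions from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
m = binary_metrics(y_true, y_pred)
print(m["accuracy"])  # 0.75
```

In practice these values would come from a fitted ensemble model (e.g., scikit-learn's RandomForestClassifier) evaluated on the held-out 20% test split.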
Figure 13: Training and validation loss of the proposed model for 100 epochs (loss decreases from roughly 0.22 to 0.14; train and test curves shown).
converted to numeric features using the one-hot encoding technique. Later, for scaling, feature normalization and standardization techniques were used. Finally, various machine learning and deep learning algorithms were used for training the model. In the training phase, 80% of the dataset was utilized, while the remaining 20% was used for testing. The proposed autoencoder technique outperformed the other models. Initially, a deep neural network and the proposed model were implemented using 100 epochs; finally, the proposed model obtained more promising results when trained for 5000 epochs. The generalized flow chart of the proposed model is depicted in Figure 3, and the proposed system algorithm is represented in Algorithm 1.

For identifying the highly correlated features in the dataset, the Pearson correlation technique was used.
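The preprocessing steps described above (one-hot encoding of categorical features, scaling of numeric features, and an 80/20 train-test split) can be sketched as follows. This is a simplified plain-Python stand-in, not the authors' pipeline, and the tiny sample records are invented for illustration.

```python
def one_hot(rows, column):
    """Replace a categorical column with one binary column per category."""
    categories = sorted({r[column] for r in rows})
    for r in rows:
        for c in categories:
            r[f"{column}_{c}"] = 1 if r[column] == c else 0
        del r[column]
    return rows

def min_max_scale(rows, column):
    """Scale a numeric column to the [0, 1] range."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    for r in rows:
        r[column] = (r[column] - lo) / (hi - lo) if hi > lo else 0.0
    return rows

# Invented sample records, loosely shaped like UNSW-NB15 rows.
rows = [
    {"proto": "tcp", "dur": 1.2, "label": 0},
    {"proto": "udp", "dur": 0.3, "label": 1},
    {"proto": "tcp", "dur": 2.1, "label": 0},
    {"proto": "arp", "dur": 0.0, "label": 1},
    {"proto": "udp", "dur": 0.9, "label": 0},
]
rows = min_max_scale(one_hot(rows, "proto"), "dur")
split = int(0.8 * len(rows))  # 80% of rows for training
train, test = rows[:split], rows[split:]
```

A production pipeline would typically shuffle and stratify the split and fit the scaler on the training portion only, to avoid leaking test statistics.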
Figure 14: Training and validation accuracy of the proposed model for 100 epochs (accuracy rises from roughly 0.86 to 0.92; train and test curves shown).
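Pearson correlation-based feature selection can be illustrated with a small self-contained sketch. The threshold value and the toy feature vectors below are assumptions for demonstration only, not the paper's settings or data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(features, target, threshold=0.5):
    """Keep features whose |correlation| with the target exceeds the threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) > threshold]

# Toy columns: 'sbytes' tracks the label closely, 'sjit' is mostly noise.
features = {
    "sbytes": [10, 50, 12, 60, 11, 55],
    "sjit":   [3, 1, 4, 1, 5, 9],
}
label = [0, 1, 0, 1, 0, 1]
print(select_features(features, label))  # expect ['sbytes'] with these toy values
```

The pairwise correlation matrix produced this way is exactly what a heatmap such as Figure 4 visualizes.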
The correlated features were analysed through a heatmap, as provided in Figure 4.

For implementation purposes, ensemble-based machine learning techniques were considered: XGBoost, AdaBoost, ExtraTreeClassifier, and Random Forest Classifier were used. The results obtained through the various machine learning algorithms are depicted in Figures 5-8.

Figures 9 and 10 represent the model summaries of the deep neural network and the proposed methodology, respectively.

4. Results and Discussions

4.1. Dataset Description and Preprocessing. In this work, the UNSW-NB15 dataset was used for the implementation. This dataset consists of a total of 175341 rows and 45 attributes. In the dataset preprocessing step, the null values were dropped first; due to this, the dataset was reduced to almost half of its size. Further, to handle the categorical features, encoding techniques were used. Later, for performing feature scaling, the normalization technique was applied. Figure 2 provides the details of the attack categories available in the dataset. A sample of the feature descriptions is depicted in Figure 11, and the dataset distribution after splitting into training and testing samples is depicted in Figure 12. For generating efficient results with the proposed model, the implementation was done on a high-configuration architecture comprising an AMD Ryzen 9 processor with 8 cores, 64-bit Windows 10 OS, 16 GB of RAM, and a 6 GB GTX 1660 Ti GPU. The complete model script was implemented in the Jupyter Notebook tool using the Python programming language.

4.2. Evaluation Metrics. The main aim of the evaluation metrics is to depict the implications of enriching the IDS with the proposed model using some of the following important parameters:

(i) Maximize detection rate (DR):

Detection rate = true positive / (true positive + false negative) (1)

(ii) Maximize accuracy (AC):

Accuracy = (true positive + true negative) / (true positive + true negative + false positive + false negative) (2)

(iii) Minimize false alarm rate (FA):

False alarm rate = false positive / (false positive + true negative) (3)

Our model obtained higher accuracy compared to the existing models, as depicted in Table 2. The proposed model was tested on various performance metrics, and classification accuracy was used as the comparison parameter against the existing models.

Table 3 provides the details of the results obtained using the various machine learning algorithms. The results obtained using the deep learning algorithm and the proposed methodology are provided in Table 2. In Table 2, only results from researchers who used accuracy as their comparative parameter are considered.

In Figures 13 and 14, the accuracy and loss results of the proposed methodology computed for 100 epochs are depicted. An appendix of all the acronyms is given in Table 4.
5. Conclusion and Future Scope

The proposed algorithm is trained using the softmax classifier to identify the attack types in the dataset. The benchmark dataset, UNSW-NB15, was used for training and testing the model. For training the hidden layers, there are many options, such as the linear, softmax, sigmoid, and rectified linear functions, which can be used as activation functions. A novel autoencoder technique was used for training and testing the model. The proposed model achieved comparatively higher accuracy than the existing systems. Several machine learning and deep learning approaches were used for implementation purposes. Further, to extend the work, a stacked autoencoder technique can be used to reduce the computational resources. Also, more focus can be given to optimizing the computational time.

Data Availability

No data were used to support this study.