Managing The Fifth Generation (5G) Wireless Mobile Communication: A Machine Learning Approach For Network Traffic Prediction
All content following this page was uploaded by Christoph Lipps on 24 June 2022.
Abstract
Recently, virtualization has gained recognition due to its flexibility, convenience, and cost efficiency. With virtualized infrastructures comes automation of tasks and orchestration between the different virtual elements in the network. Although this saves a considerable amount of task organization throughout the system, monitoring the network cannot be disregarded. Network traffic offers different types of information that can be connected to network management challenges, such as resource management and anomaly detection. In this paper, traffic prediction is examined in relation to upcoming communication systems. To this end, a previously designed virtual 5G system serves as the source of the network traffic retrieved for training purposes. Results of training a neural network for prediction are presented, along with suggestions on where the designed model can further serve the network management process.
Index Terms
Network Traffic, Time-Series Prediction, Deep Learning, Network Management, 5G
II. RELATED WORK

DL techniques are investigated in [1] considering their compatibility with network analysis applications, the so-called Network Traffic Management and Analysis (NTMA). The authors also review previous work on network analysis from four perspectives: network traffic classification, fault management, network traffic prediction, and network security, i.e., anomaly detection or intrusion detection. The work in [4] focuses explicitly on active or real-time network analytics, presenting their basics together with a survey of use cases and suitable software platforms. A novel DL-based approach was proposed in [5], combining both feature extraction and classification of network traffic. In [6], a comprehensive survey is provided on the role of ML techniques in communication, considering three layers of communication systems: the physical, access, and network layers. Additionally, recent computing and networking concepts, such as Multi-access Edge Computing (MEC), Software Defined Networking (SDN), and Network Function Virtualization (NFV), are taken into consideration.

A ML-based framework for autonomous network management in 5G systems was introduced in [7], providing Self-Organizing Networks (SONs) that operate with the help of intelligent network slicing. The process of achieving such a smart framework begins with monitoring, collecting, extracting, and analyzing the network's metrics using SDN/NFV sensors, aiming to derive network states and take actions when required; a step towards an autonomously-managed network with self-healing, self-protection, and self-optimization as part of the SELFNET project. In [8], SDNs were investigated from a security point of view, and use cases were proposed tackling Physical Layer Security (PhySec) approaches, such as scheduled traffic control and link quality monitoring.

The virtual 5G system used in this work has a simulated RAN and UE implemented. The core network is initialized for a connection to be initiated, and the RAN context is registered into the 5G core and prepared to accept a new device connection. The subscriber details of a device are saved into the database through a web console. On initialization of the UE, an IP address is assigned, the UE is registered into the network, and internet access is enabled. In addition to the existing framework, graphical visualization of the network traffic data is provided through Grafana [10].

Due to the lack of publicly available datasets in relation to 5G, traffic data generated over several months from full IGP routing, BGP routing, and sampled NetFlow data as part of the GEANT network [11] was used. For the overall evaluation of the performed network traffic prediction, the trained model is evaluated using a sample from the GEANT project's public dataset [12].

III. TIME SERIES PREDICTION

Fundamentally, the learning process begins with a given input to each Neural Network (NN) in a ML model, by which the output vectors of the hidden layers are computed. The selected activation functions are then applied to each hidden layer, until finally their output is passed to the subsequent layers as an input. Feed Forward Neural Networks (FFNNs) operate by passing each hidden layer's output to the following layer in the direction of the output layer only (forward). RNNs, in contrast, enable sharing of parameters over different time steps through the model, passing dependencies between previous and future data instances through the network. This is beneficial especially when handling time-series data, where features like seasonality, trend, and correlation can be found; RNNs assume that each future instance of data depends on a window of previous data points. The general form of RNN comes in different variations as mentioned in Section I; here the general RNN, LSTM, and bidirectional LSTM are discussed in terms of architecture and compatibility with time-series data.
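As an illustration of the recurrence described above, the following is a minimal NumPy sketch of an RNN forward pass (illustrative only; the weight names and layer sizes are assumptions, not taken from the evaluated model). It shows the key property the text describes: the same weights are reused at every time step, and a state vector carries information from earlier inputs to later outputs.

```python
import numpy as np

# Illustrative sketch (not the paper's model): a single recurrent layer
# applies the SAME weights at every time step, carrying a state vector
# forward so that earlier inputs influence later outputs.
rng = np.random.default_rng(0)
n_in, n_state, T = 1, 8, 5            # input size, state size, time steps

W_x = rng.normal(scale=0.1, size=(n_state, n_in))     # input-to-state weights
W_s = rng.normal(scale=0.1, size=(n_state, n_state))  # state-to-state weights
b = np.zeros(n_state)

def rnn_forward(x_seq):
    """Run the recurrence s_t = tanh(W_x x_t + W_s s_{t-1} + b)."""
    s = np.zeros(n_state)             # initial state
    states = []
    for x_t in x_seq:                 # same W_x, W_s reused at every step
        s = np.tanh(W_x @ x_t + W_s @ s + b)
        states.append(s)
    return np.stack(states)           # shape (T, n_state)

x_seq = rng.normal(size=(T, n_in))
states = rnn_forward(x_seq)
print(states.shape)                   # (5, 8)
```

Because the state is overwritten at each step, the influence of early inputs fades over time, which is exactly the limitation that motivates the LSTM memory cell discussed next.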
Fig. 2: Recurrent neural networks architecture (inputs x1…xt, states s1…st, outputs h1…ht).

In a general RNN, the state is updated at every time step, its impact fading away over time steps. However, in LSTM the state is stored in a memory cell and passed over through the whole lifetime of training if necessary. The hidden vector h at time t is represented as follows [2]:

h_t = o_t tanh(c_t)    (3)

where c_t refers to the memory cell, and o_t refers to the output gate that determines the preserved amount of memory from previous states and is represented accordingly:

o_t = f(W_xo x_t + W_ho h_{t-1} + W_co c_t + b_o)    (4)

Fig. 3: Long short term memory architecture (with input gate i_t, forget gate f_t, output gate o_t, and memory cell c_t).

The memory cell c_t is updated according to the gates' decision to forget part of its current memory and to add new content. Therefore, a previous data point can have a greater impact on the overall projection in LSTM than in RNN. LSTM has proved compatible with data that contains high dependency between its points at different time steps, such as seasonal time-series, speech recognition, or automatic translation [1]. Figure 3 represents the architecture of LSTM with a detailed view of the memory unit that leads to the output gate's final decision at each instance. A bidirectional LSTM, in turn, enables the state to be passed in both directions, forward and backwards, leading to an additional impact from data points at a later time step on data points at an earlier one.

IV. DATA COLLECTION AND PREPARATION

This section briefly describes the data generation process and the specifications of the data collection process, from the selected metric to the time of monitoring, and finally the preparation phase before the data can serve as an input to the trained model.

A. Data Generation

In Section II, the infrastructure used for network data generation was presented, which is also part of the work in [3]. The network components of the 5G core, RAN, and UE are initialized, and the UE is made to perform various activities such as ping and streaming of a high-quality video for data generation. Through the network monitoring tools Grafana and Prometheus, the 5G core network traffic data is collected and prepared for training a neural network for traffic prediction. As this work aims to predict future traffic volume, a specific metric of the traffic originating from the core network is recorded: the total packets transmitted in Mbit/s. The amount of transmitted bits is recorded from the traffic for a total duration of 3 hours at time steps of 5 seconds, resulting in around 900 values.

B. Data Preparation

Pre-processing of the raw data is required, as shown in Figure 4, in order for it to be fed as an input to the further training procedure shown in Figure 5. First, the values are normalized by dividing by the highest value in the dataset, resulting in values on a scale from 0 to 1. Before feeding the data into the model, it is split into training and validation datasets, training being 70% of the total amount of data and validation the remaining 30%, considering the size of the data [13]. Lastly, features and labels are defined, which enable the ML model to carry out further predictions based on previous values. Since it is challenging to define specific features and labels in time-series data, in this case the features are based on a parameter referred to as the window size. A window is defined consisting of a number of previous data points as the feature, with the next value right after the end of the window as the label, leading to a windowed dataset according to a specified window size, as shown in the rightmost step in Figure 4.
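The preparation steps above (max-normalization, 70/30 split, and windowing) can be sketched as follows. This is an illustrative NumPy version, not the authors' code; the window size of 30 matches the final value reported in Section V, and the synthetic series stands in for the recorded traffic trace.

```python
import numpy as np

def prepare(series, window_size=30, train_frac=0.7):
    """Normalize, split, and window a 1-D traffic series (illustrative)."""
    series = np.asarray(series, dtype=float)
    series = series / series.max()              # scale to [0, 1]

    split = int(len(series) * train_frac)       # 70% train / 30% validation
    train, valid = series[:split], series[split:]

    def windowed(x):
        # Each feature row is `window_size` consecutive points;
        # the label is the value immediately after the window.
        X = np.stack([x[i:i + window_size]
                      for i in range(len(x) - window_size)])
        y = x[window_size:]
        return X, y

    return windowed(train), windowed(valid)

# ~900 samples, as in the recorded 3 h / 5 s trace (synthetic stand-in)
series = np.abs(np.sin(np.linspace(0, 30, 900))) + 0.1
(X_tr, y_tr), (X_va, y_va) = prepare(series)
print(X_tr.shape, y_tr.shape)   # (600, 30) (600,)
```

The resulting (features, label) pairs are exactly what a Keras model with a one-neuron output layer can be fitted on.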
V. RESULTS AND DISCUSSION

Building and training the model is performed using the TensorFlow and Keras libraries in Python, within a virtual Anaconda environment. First, considering that the dataset is relatively small, a simple NN is built with two layers for each architecture, each with 400 neurons, in addition to a Dense layer with only one neuron for the output. An arbitrary choice of the parameters batch size, window size, and number of epochs is made at first, while an automatic optimization of the learning rate is performed by scheduling a callback on every epoch with a different learning rate and finding the value with the lowest possible loss, as illustrated in Figure 5. After obtaining the learning rate's optimum value, the model is updated and a manual random optimization is performed for the rest of the parameters, reaching final values of batch size = 16, window size = 30, and number of epochs = 200, which result in low loss and less loss fluctuation. The training was performed using RNN, LSTM, and bidirectional LSTM, considering identical parameters, except for the learning rate, which was optimized for each type independently. Finally, the model is compiled with Mean Squared Error (MSE) as the loss function and Stochastic Gradient Descent (SGD) as the optimizer. In order to evaluate the performance of each technique, a naive approach of forecasting is performed to form a baseline against which the built model is compared. As the name suggests, the naive forecasting approach simply considers that a value x_t at a particular time step should be equal to the value x_{t-1} at the previous time step. In addition to naive forecasting, a Moving Average (MA) approach is performed, within which a value x_t at a particular time step depends on the average of the values x_{t-1} to x_{t-n} at previous time steps [2]. The simplicity of these two methods makes them sufficient as a baseline of how efficiently the developed model is performing.

A. Results

Training the model with the different NN architectures, a prediction is performed using the validation dataset, which represents 30% of the total amount of data, and the loss is calculated and represented as MSE values between the original and forecasted values. MSE indicates a well-performing model with a good forecast when its value is close to zero. Figure 6 illustrates the difference between original and forecasted values for each trained NN. Figure 6a depicts that the forecast using RNN is less noisy and closer to the original values, while Figures 6c and 6d provide a very similar forecast. Figure 7 shows a brief comparison in terms of loss between the forecasted and original values using the different architectures, where the naive approach is referred to as the Naive Method (NM); the architecture with the best performance and lowest MSE value is the RNN. Figure 6b shows a closer view of the forecasted values using the RNN architecture. From observing the traffic data, a lack of seasonality and trends throughout the traffic is noticed, which is the reason behind the simple RNN performing better than the LSTM-based architectures.

B. Deployment

In order to run the evaluation over the virtually simulated 5G network, the designed ML model is deployed into a monitoring stack that consists of a virtualized Graphics Processing Unit (GPU) cluster, managed and orchestrated through a Slurm cluster. Data from the live network traffic is collected and stored using Prometheus as time-series data metrics. The metrics are sent periodically to the monitoring stack using a secure transfer protocol with a time step of 60 minutes and are further subjected to ML analysis. After retrieving the traffic, the model is loaded and evaluated using the recent traffic update, resulting in a similar behaviour on the recently collected data regarding loss evaluation, where the overall loss was less than the baseline set by the NM.

VI. CONCLUSION AND OUTLOOK

Automation and orchestration are becoming more involved in the upcoming communication systems, which triggers the need for continuous monitoring of the network status. Extensive awareness of the network status can be beneficial for network management applications, such as resource management and load balancing, or for security-related applications like anomaly detection. This work takes the first steps in examining the role of ML techniques in communication systems like 5G, particularly through predicting the traffic volume through the network. A pre-built virtual 5G system is used for traffic generation and recording to analyze the efficiency of different DL architectures, such as RNN and LSTM, in generating a time-series forecast. A ML model is built, trained, and evaluated through observation of the MSE value between the original and forecasted values. Further aspirations include improving the model by training on a larger dataset, achieving a continuous forecast from the retrieved system statistics, and examining the prediction's impact on enhancing the decision-making process to strengthen the overall system performance.
Fig. 5: The process of training the neural network for prediction, including optimization of hyperparameters.
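The per-epoch learning-rate search shown in Fig. 5 can be sketched as follows. This is an illustrative version only: the exponential sweep constants are assumptions, not the authors' settings, and the Keras wiring is shown in comments as the common pattern for this kind of search.

```python
# Illustrative sketch: sweep the learning rate over the epochs and keep
# the value with the lowest loss. The schedule constants below are an
# assumption (a common exponential sweep), not the paper's settings.

def lr_schedule(epoch, start=1e-8, factor=10 ** (1 / 20)):
    """Exponentially increasing learning rate: start * 10^(epoch / 20)."""
    return start * factor ** epoch

# With Keras this would typically be attached as a per-epoch callback:
#   cb = tf.keras.callbacks.LearningRateScheduler(lr_schedule)
#   history = model.fit(X_tr, y_tr, epochs=100, callbacks=[cb])
#   # then pick the learning rate at the epoch with the minimum loss.

def best_learning_rate(losses):
    """Return the learning rate at the epoch with the minimum loss."""
    best_epoch = min(range(len(losses)), key=losses.__getitem__)
    return lr_schedule(best_epoch)

losses = [0.9, 0.5, 0.2, 0.35, 0.8]   # toy per-epoch loss curve
print(best_learning_rate(losses))      # learning rate at epoch 2
```

After this search, the model is recompiled with the selected learning rate and trained normally, as described in Section V.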
Fig. 6: Original versus forecasted values. (a) Recurrent Neural Networks (RNNs). (b) A closer view of the forecast using RNN.
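The two baselines from Section V, whose MSE Fig. 7 compares against the trained networks, can be sketched as follows (illustrative NumPy version on a toy series; the window length n = 5 for the moving average is an assumption):

```python
import numpy as np

def naive_forecast(x):
    """Naive baseline: predict x_t as x_{t-1}."""
    return x[:-1], x[1:]               # predictions, targets

def moving_average_forecast(x, n=5):
    """MA baseline: predict x_t as the mean of the previous n values."""
    preds = np.array([x[i - n:i].mean() for i in range(n, len(x))])
    return preds, x[n:]

def mse(pred, target):
    """Mean Squared Error, the loss used throughout the paper."""
    return float(np.mean((pred - target) ** 2))

# Toy stand-in series: both baselines produce a small MSE on smooth data.
x = np.sin(np.linspace(0, 6, 200))
p_naive, t_naive = naive_forecast(x)
p_ma, t_ma = moving_average_forecast(x)
print(mse(p_naive, t_naive), mse(p_ma, t_ma))
```

Their simplicity is the point: any trained model worth deploying should beat these two numbers on the validation set.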
Fig. 7: A comparison between the Mean Squared Error (MSE) values of different methods.

ACKNOWLEDGMENT

This work has been supported by the Federal Ministry of Education and Research of the Federal Republic of Germany (Förderkennzeichen 16KIS1320, AI-NET ANTILLAS). The authors alone are responsible for the content of the paper.

REFERENCES

[1] M. Abbasi, A. Shahraki, and A. Taherkordi, "Deep learning for network traffic monitoring and analysis (NTMA): A survey," Computer Communications, vol. 170, pp. 19–41, 2021. DOI: 10.1016/j.comcom.2021.01.021.
[2] N. Ramakrishnan and T. Soni, "Network Traffic Prediction Using Recurrent Neural Networks," in 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 2018, pp. 187–193. DOI: 10.1109/ICMLA.2018.00035.
[3] R. Reddy, S. Baradie, and C. Lipps, "Edge Computing and the Fifth Generation (5G) Mobile Communication: A Virtualized, Distributed System Towards a New Networking Concept," in Proceedings of the Workshop on Next Generation Networks and Applications (NGNA), Kaiserslautern, Germany, 2021.
[4] S. Verma, Y. Kawamoto, Z. M. Fadlullah, H. Nishiyama, and N. Kato, "A Survey on Network Methodologies for Real-Time Analytics of Massive IoT Data and Open Research Issues," IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1457–1477, 2017. DOI: 10.1109/COMST.2017.2694469.
[5] M. Lotfollahi, M. Jafari Siavoshani, R. Shirali Hossein Zade, and M. Saberian, "Deep packet: A novel approach for encrypted traffic classification using deep learning," Soft Computing, vol. 24, no. 3, pp. 1999–2012, 2020. DOI: 10.1007/s00500-019-04030-2.
[6] I. Ahmad, S. Shahabuddin, H. Malik, E. Harjula, T. Leppänen, L. Lovén, A. Anttonen, A. H. Sodhro, M. Mahtab Alam, M. Juntti, A. Ylä-Jääski, T. Sauter, A. Gurtov, M. Ylianttila, and J. Riekki, "Machine Learning Meets Communication Networks: Current Trends and Future Challenges," IEEE Access, vol. 8, pp. 223418–223460, 2020. DOI: 10.1109/ACCESS.2020.3041765.
[7] W. Jiang, M. Strufe, and H. Schotten, "Machine Learning-Based Framework for Autonomous Network Management in 5G Systems," in Proc. 2018 European Conference on Networks and Communications (EuCNC), Ljubljana, Slovenia, 2018.
[8] C. Lipps, D. Krummacker, and H. D. Schotten, "Securing Industrial Wireless Networks: Enhancing SDN with PhySec," in 2019 Conference on Next Generation Computing Applications (NextComp), 2019, pp. 1–7. DOI: 10.1109/NEXTCOMP.2019.8883600.
[9] Free5GC, https://www.free5gc.org/, Feb. 2022.
[10] Grafana, https://grafana.com/.
[11] GEANT, https://geant.org/, Feb. 2022.
[12] S. Uhlig, B. Quoitin, J. Lepropre, and S. Balon, "Providing Public Intradomain Traffic Matrices to the Research Community," SIGCOMM Comput. Commun. Rev., vol. 36, no. 1, pp. 83–86, Jan. 2006. DOI: 10.1145/1111322.1111341.
[13] Y. Xu and R. Goodacre, "On splitting training and validation set: A comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning," Journal of Analysis and Testing, vol. 2, no. 3, pp. 249–262, 2018. DOI: 10.1007/s41664-018-0068-2.