Traffic Flow Prediction Using Deep Learning Techniques
Shubhashish Goswami1,3 [0000-0002-6129-9822] and Abhimanyu Kumar2
1,2 National Institute of Technology Uttarakhand
subh.goswami@gmail.com
3 Dev Bhoomi Uttarakhand University Dehradun
1 Introduction
The “Intelligent Transportation System (ITS) plays a critical role in
real-world traffic management and control, benefiting traffic safety,
efficiency, and congestion reduction, among other things” [1]. Accurate
traffic flow predictions for the highway network provide critical data that
allow ITS to make proactive and effective traffic management decisions [2].
Traffic congestion, accidents, and delays, all of which are aggravated by
the rapid rise in the number of motor vehicles, put significant pressure on
the transportation system. The most efficient way to address these
difficulties is to build a dependable traffic management plan based on
traffic flow forecasts. Traffic flow is the number of vehicles passing
along a route in each time interval [3].
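As an illustration of this definition, the following minimal sketch counts
vehicle passages per fixed time interval. The function name, the input
format (passage times in seconds), and the 5-minute interval are
assumptions made for the example, not part of the data set used in this
paper.

```python
from collections import Counter


def flow_per_interval(timestamps_s, interval_s=300):
    """Count vehicles per fixed-length time interval.

    timestamps_s: vehicle passage times in seconds (hypothetical input).
    interval_s:   interval length in seconds (5 minutes is an assumption).
    Returns one count per interval, from interval 0 to the last observed one.
    """
    counts = Counter(int(t // interval_s) for t in timestamps_s)
    if not counts:
        return []
    last = max(counts)
    # Intervals without any passage get a flow of 0.
    return [counts.get(i, 0) for i in range(last + 1)]


passages = [12, 95, 290, 301, 610, 899, 900]
print(flow_per_interval(passages))  # [3, 1, 2, 1]
```

The resulting list is the kind of time series that the prediction models
discussed below take as input.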
The widespread use of traditional traffic sensors and the development of
new traffic sensing devices have led to the emergence of large-scale
transportation data [5]. Transportation planning and monitoring are
becoming increasingly data-driven [6]. While numerous traffic flow
prediction tools and algorithms exist, the majority of them rely on shallow
traffic models and are still unsatisfactory.
This paper proposes a traffic flow prediction (TFP) approach based on deep
learning (DL) techniques. Deep learning methods achieve a high level of
accuracy and can efficiently capture spatial information. Deep learning
techniques such as the “Stacked Auto-Encoder (SAE) [5], Convolutional
Neural Network (CNN) [8], and Long Short-Term Memory (LSTM) neural
network” can be used to predict traffic flow data. To extract general
traffic flow characteristics, a stacked autoencoder (SAE) model is built
and trained in a greedy layer-wise fashion. It is the first time, to the
authors' knowledge, that the SAE method has been used to represent traffic
flow characteristics for prediction. “Spatial and Temporal Correlations”
are taken into account in the modeling process. A CNN is a type of neural
network used to capture spatial correlation in grid-structured data such as
images or videos.
Several studies have used CNNs to detect spatial correlations in traffic
networks from 2D spatio-temporal traffic data. Because a traffic network is
difficult to describe with 2D matrices, numerous studies have converted the
state of the traffic network at different times into images and divided
these images into regular grids, with each grid cell representing a region.
In this way, CNNs can be used to learn spatial features across regions [8].
The RNN approach is generally used for tasks that involve sequential input.
RNNs process an input sequence one element at a time and keep in their
hidden units a state that effectively contains information about all
earlier elements of the sequence [9]. The “Long Short-Term Memory (LSTM)”
network exhibits high accuracy in natural language processing [10] and may
be used to solve a variety of time series problems. Zhao et al.'s
LSTM-based traffic flow forecasting captures temporal information about
traffic flow over time. Because this technique employs only an LSTM to
analyze the data, it fails to capture spatial information [11]. Different
DNN models have different strengths. For example, CNNs are better at
capturing spatial dependencies in transportation networks; RNNs and LSTMs
are better at capturing temporal dependencies, and LSTMs can even identify
long-term temporal correlations. SAEs perform better at extracting latent
features from raw data. Because traffic flow depends on both temporal and
spatial information, hybrid combinations of several DNN models have become
a common technique for improving traffic forecasting accuracy in recent
years [12].
2 Related Works
With the advancement of intelligent transportation over the last decade,
TFP has become a major research area in the field. Many practitioners and
academics have devoted their time and effort to traffic flow prediction,
proposing a wide range of prediction approaches. To capture the various
aspects of fundamental traffic flow patterns, Reference [13] offers an
efficient hybrid time series forecasting technique that combines the
autoregressive integrated moving average (ARIMA) model with genetic
programming, and [14] proposes a hybrid TFP method based on the
multifractal characteristics of traffic flow time series. The link between
weather factors and traffic flow is studied in Reference [15], which
presents a framework for improving traffic flow forecasts. In addition,
[16] presents a traffic prediction technique based on a deep belief network
(DBN) architecture with multitask learning to forecast traffic flow for
single-task and multi-task outputs. [17] proposes a deep self-coding
learning technique that is applied to the intelligent transportation system
in Macao.
For traffic flow prediction, Reference [18] proposes the GRU neural network
(NN) algorithm. In their experiment, the authors evaluated the predictive
accuracy of the ARIMA, LSTM, and GRU algorithms and found that LSTM NNs and
GRU NNs outperformed ARIMA. GRU NNs lowered the MAE by 10% on average
compared with ARIMA and by 5% compared with LSTM NNs. As future work, they
plan to examine RNNs with more hidden units, and note that variable-length
time sequence inputs may help RNNs determine the best time lags
automatically.
Hybrid DL techniques for TFP have received a great deal of attention in
recent years owing to the rapid rise of DL theory and applications, as well
as the spatial and temporal dependencies in traffic flow data. [12]
examines hybrid DL techniques for TFP in depth. It first presents the
different data sources used in hybrid traffic flow prediction algorithms,
and then moves on to the hybrid modeling techniques, which range from basic
to sophisticated. A review of current methods shows that hybrid methods for
traffic flow prediction are becoming increasingly sophisticated in order to
capture more information from transportation data. With the growing
collection of finer-grained, multi-type data from transportation networks,
hybrid learning algorithms will likely continue to improve and to integrate
more data characteristics, allowing for more accurate and scalable traffic
flow prediction.
Large-scale mobility data and DL for traffic forecasting are discussed in
[9]. With effective unbiased feature representation, deep learning improves
traffic forecasts. DL theory-based models for traffic data were presented
in “Large-Scale Transportation Network Congestion Development
Forecasting”. To avoid traffic delays on a large transportation network, it
is critical to analyze congestion. Traditional congestion forecasting
systems rely on static data. With emerging technologies such as Intelligent
Transportation Systems (ITS) and the Internet of Things (IoT),
transportation data has become pervasive. A deep restricted Boltzmann
machine with an RNN architecture was introduced to predict traffic
congestion [9]. This design predicts traffic data more precisely and
accurately than existing machine learning techniques. Predicting traffic
statistics is difficult because of the presence of both short-term and
long-term patterns. The LSTM model [10], a general technique in deep
learning, is used to estimate short-term traffic data. It outperforms other
machine learning techniques currently in use.
3 Proposed work
Deep learning techniques may achieve higher performance than traditional
approaches by using many more features and more sophisticated
architectures.
“An Auto-Encoder (AE) is a neural network that aims to replicate its input,
with the target output being the model's input. An autoencoder with one in-
put layer, one hidden layer, and one output layer” is represented in Figure 2.
Figure 2: Auto-Encoder
$d(a) = q(H_2 b + j)$ (2)
We can obtain the set of parameters, denoted by $\varepsilon$, by
minimizing the reconstruction error S(A, D):
$\varepsilon = \arg\min_{\varepsilon} S(A, D) = \arg\min_{\varepsilon} \frac{1}{2} \sum_{m=1}^{F} \left\| a^{(m)} - d\left(a^{(m)}\right) \right\|^2$
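The encoding, decoding, and reconstruction error of a single-hidden-layer
autoencoder can be sketched as follows. This is a minimal pure-Python
illustration, not the implementation used in the experiments: the encoder
parameter names H1 and c are assumptions (only the decoder form of
Equation 2 is given here), and q is assumed to be the sigmoid function.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, bias):
    """Return W x + bias for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(W, bias)]


def encode(a, H1, c):
    """Hidden representation b = q(H1 a + c); H1 and c are assumed names."""
    return [sigmoid(v) for v in affine(H1, a, c)]


def decode(b, H2, j):
    """Reconstruction d(a) = q(H2 b + j), following Equation 2."""
    return [sigmoid(v) for v in affine(H2, b, j)]


def reconstruction_error(samples, H1, c, H2, j):
    """S(A, D): half the sum of squared reconstruction errors."""
    total = 0.0
    for a in samples:
        d = decode(encode(a, H1, c), H2, j)
        total += sum((ai - di) ** 2 for ai, di in zip(a, d))
    return 0.5 * total


# Toy example: 3 inputs, 2 hidden units, all parameters set to zero,
# so every reconstructed component equals sigmoid(0) = 0.5.
H1 = [[0.0] * 3 for _ in range(2)]
c = [0.0, 0.0]
H2 = [[0.0] * 2 for _ in range(3)]
j = [0.0, 0.0, 0.0]
print(reconstruction_error([[1.0, 0.0, 0.5]], H1, c, H2, j))  # 0.25
```

Training would adjust H1, c, H2, and j to reduce this quantity; the
optimizer itself is omitted from the sketch.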
One significant difficulty with an autoencoder is that if the hidden layer
is the same size as or larger than the input layer, the autoencoder might
simply learn the identity function. However, recent work demonstrates that
this is not an issue if a nonlinear AE contains more hidden units than
inputs, or if additional constraints such as sparsity requirements are
imposed [19]. “When sparsity restrictions are applied to the objective
function, an AE is transformed into a sparse AE, which takes into account
the hidden layer's sparse representation. To obtain the sparse
representation, we minimize the reconstruction error subject to a sparsity
constraint”.
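$C = S(A, D) + \beta \sum_{n=1}^{AE} \mathrm{KL}(\mu \,\|\, \hat{\mu}_n)$ (3)
where $\beta$ is the weight of the sparsity term, the sum runs over the
hidden units, $\mu$ is the sparsity parameter, $\hat{\mu}_n$ is the average
activation of hidden unit $n$ over the training set, and
$\mathrm{KL}(\mu \,\|\, \hat{\mu}_n)$ is the Kullback–Leibler (KL)
divergence, defined as
$\mathrm{KL}(\mu \,\|\, \hat{\mu}_n) = \mu \log \frac{\mu}{\hat{\mu}_n} + (1 - \mu) \log \frac{1 - \mu}{1 - \hat{\mu}_n}$ (4)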
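The sparsity penalty of Equations 3 and 4 can be illustrated with a short
sketch. The default values mu = 0.05 and beta = 3.0 are illustrative
assumptions, not the settings used in this study, and the function names
are hypothetical.

```python
import math


def kl_divergence(mu, mu_hat):
    """KL divergence between Bernoulli variables with means mu and mu_hat.

    Both arguments must lie strictly between 0 and 1.
    """
    return (mu * math.log(mu / mu_hat)
            + (1 - mu) * math.log((1 - mu) / (1 - mu_hat)))


def sparse_cost(reconstruction_error, mean_activations, mu=0.05, beta=3.0):
    """C = S(A, D) + beta * sum of KL(mu || mu_hat_n) over hidden units."""
    penalty = sum(kl_divergence(mu, m) for m in mean_activations)
    return reconstruction_error + beta * penalty


# The penalty vanishes when every hidden unit has mean activation mu ...
print(sparse_cost(1.0, [0.05, 0.05]))  # 1.0
# ... and grows as the mean activations move away from mu.
print(sparse_cost(1.0, [0.5, 0.5]) > 1.0)  # True
```

Because the penalty is zero only when the average activation equals the
sparsity parameter, minimizing C pushes most hidden units to stay close to
inactive for most inputs.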
To use the SAE network for TFP, a standard predictor must be added on top
of it. In this study, a logistic regression layer is placed on top of the
network for supervised traffic flow prediction. The SAE and the predictor
together make up the complete deep architecture for traffic flow
prediction.
“CNNs are a form of Feedforward Neural Network with a basic structure and
convolution computations”. They are among the best-known DL techniques.
CNNs were first studied in the 1980s and 1990s. The earliest convolutional
neural networks were the time-delay neural network and LeNet-5; in the
twenty-first century, with the emergence of the DL approach and the
advancement of “Numerical Computing Equipment, Convolutional Neural
Networks have evolved significantly and are now used in Image
Classification, Speech Recognition, Natural Language Processing, and a
variety of other fields” [20].
“The Convolutional Layer, Pooling Layer, and Fully Connected Layer are the
3 aspects of a CNN. The convolutional layer's primary role is feature
extraction”. The shape of the convolution kernel mimics the way an animal's
visual system observes objects, focusing first on the local information in
the image. It can therefore effectively extract the local spatial features
within its receptive field. The pooling layer performs feature selection
and dimension reduction on the feature maps generated by the convolutional
layer; the fully connected layer follows the convolutional and pooling
layers. For the classification or regression task, the multi-dimensional
abstract features are flattened into a vector and passed to it as input.
The weight sharing of CNNs is therefore exploited in this study to obtain
the spatiotemporal features of traffic flow while reducing model complexity
and computational cost [20].
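Weight sharing and pooling can be made concrete with a small pure-Python
sketch. One kernel is slid over the whole input ("valid" positions only),
so the same few weights are reused at every location, and a 2*2 pooling
step then halves each dimension. The layout of the input (rows as time
steps, columns as road segments) and the use of max pooling are
assumptions made for the example.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image, without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool_2x2(fmap):
    """Keep the largest value of each non-overlapping 2*2 block."""
    return [[max(fmap[r][c], fmap[r][c + 1],
                 fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]


# Hypothetical 5*5 spatio-temporal matrix: rows = time steps, cols = segments.
flow = [[r + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, 1]]          # 4 shared weights for the whole input

features = conv2d_valid(flow, kernel)   # 4*4 feature map
pooled = max_pool_2x2(features)         # 2*2 pooled map
print(pooled)  # [[6, 10], [10, 14]]
```

The kernel has four parameters regardless of the input size, which is the
reduction in model complexity that weight sharing provides.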
Consider the CNN architecture [21], which has two convolution layers and
two pooling layers.
$r_n^1 = x_n^1$ (6)
$z_n^1 = g(r_n^1)$ (7)
where “$\mathrm{Con2}(W, M_n^1, \mathrm{Valid}) + y_n^1$ indicates a narrow
convolution, and $g$ denotes the activation function”.
“P2: initial pooling, pooled frame for 2*2, size of feature map 28*28 pooled
into 14*14 pool mapping, total pool maps derived”:
$p_n^2 = \alpha_n^2 \, \mathrm{low}(z_n^1) + y_n^2$ (8)
$r_n^2 = p_n^2$ (9)
$z_n^2 = g(r_n^2)$ (10)
$r_n^3 = p_n^3$ (12)
$z_n^3 = g(r_n^3)$ (13)
“P4: Pooled once more, frame for 2*2, pooling a 10*10 feature map into a
5*5 pool map, resulting in a pool map of F3”:
$p_n^4 = \alpha_n^4 \, \mathrm{low}(z_n^3) + y_n^4$ (14)
$r_n^4 = p_n^4$ (15)
$z_n^4 = g(r_n^4)$ (16)
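The feature map sizes quoted for the pooling stages (28*28 to 14*14 and
10*10 to 5*5) can be checked with simple arithmetic. The sketch below
assumes narrow ("valid") convolutions with a 5*5 kernel, which is what the
step from 14*14 to 10*10 implies, and a 32*32 input, which is an assumption
obtained by applying the same kernel size to the first layer.

```python
def valid_conv_size(size, kernel):
    """Side length after a narrow (valid) convolution."""
    return size - kernel + 1


def pool_size(size, window=2):
    """Side length after non-overlapping pooling."""
    return size // window


def feature_map_sizes(input_size=32, kernel=5, window=2):
    """Side lengths after the two convolution and two pooling stages."""
    c1 = valid_conv_size(input_size, kernel)
    p2 = pool_size(c1, window)
    c3 = valid_conv_size(p2, kernel)
    p4 = pool_size(c3, window)
    return {"C1": c1, "P2": p2, "C3": c3, "P4": p4}


print(feature_map_sizes())  # {'C1': 28, 'P2': 14, 'C3': 10, 'P4': 5}
```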
The input gate determines if the current feature is effective, the output
gate determines if the recent data is valuable, and the forget gate
remembers the earlier inaccurate data decision. Finally, the sequence
$z_n^4$ ($n = 1, 2, \ldots, G_3$) is
The weights of the hidden and output layer are defined by and V, respec-
tively, whereas the transition weights of the hidden level are symbolized by
W. The network's hidden state at time t is computed from the current input
and the previous hidden state $h_{t-1}$. Equation 17 shows the hidden state
at time t. Figure 4 shows the flowchart of the LSTM model.
$K_t = \sigma(G_{hx} x_t + W_{hh} h_{t-1} + B)$ (17)
“$G_{hx}$ is the weight between the input and recurrent hidden nodes,
$W_{hh}$ is the weight between the recurrent node and the previous time
step of the hidden node itself, and $B$ and $\sigma$ are the bias and
non-linear (sigmoid) activation, respectively”. Although RNNs outperform
other algorithms in time series prediction, they still have problems that
need to be addressed.
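The recurrent update of Equation 17 can be sketched in a few lines. This is
an illustrative pure-Python version with a sigmoid activation; the function
name and the toy weights are assumptions, and the previous output is fed
back as the previous hidden state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def rnn_step(x_t, h_prev, G_hx, W_hh, B):
    """Hidden state at time t: sigmoid(G_hx x_t + W_hh h_prev + B)."""
    state = []
    for g_row, w_row, b in zip(G_hx, W_hh, B):
        pre = (sum(g * x for g, x in zip(g_row, x_t))
               + sum(w * h for w, h in zip(w_row, h_prev))
               + b)
        state.append(sigmoid(pre))
    return state


# Toy example: 1 input feature (a flow value), 2 hidden units.
G_hx = [[0.5], [-0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
B = [0.0, 0.0]

h = [0.0, 0.0]
for flow in [0.2, 0.4, 0.6]:       # hypothetical normalized flow sequence
    h = rnn_step([flow], h, G_hx, W_hh, B)
print(h)
```

Because the same weights are applied at every time step, the final state
depends on the whole sequence, which is how the network carries historical
information forward.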
Figure 4 shows the construction of the LSTM-NN with one memory block.
Input, output, and forget gates are present in the memory block, and they
provide the write, read, and reset functions on each cell, respectively. The
working mechanism of LSTM is shown in equations 18-22.
$l_t = k_t(y\,a_0)$ (22)
The LSTM model has been shown to achieve superior accuracy. The LSTM is a
widely used RNN that can exploit the pattern of temporal changes in a time
series. To overcome the vanishing gradient problem of RNNs, the LSTM
introduces the notion of a cell; that is, the network is the same as an RNN
except that the hidden layer is replaced by a memory block. The input gate
determines whether the current input is useful, the output gate determines
whether the current information is valuable, and the forget gate decides
whether previously stored information that is no longer valid should be
discarded [21].
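The write, read, and reset roles of the three gates can be illustrated
with one step of a scalar memory block. The sketch follows the commonly
used LSTM formulation, which may differ in notation from the equations of
this section; the parameter layout (a dictionary of weight triples) is a
hypothetical choice made for brevity.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM memory block.

    p maps each of "i", "f", "o", "g" to a (w_x, w_h, bias) triple
    (hypothetical layout chosen for this example).
    """
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x_t + w_h * h_prev + bias

    i_t = sigmoid(pre("i"))       # input gate: how much is written
    f_t = sigmoid(pre("f"))       # forget gate: how much old state is kept
    o_t = sigmoid(pre("o"))       # output gate: how much is read out
    g_t = math.tanh(pre("g"))     # candidate value to write
    c_t = f_t * c_prev + i_t * g_t
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# With all parameters at zero, every gate is half open.
zeros = {name: (0.0, 0.0, 0.0) for name in "ifog"}
print(lstm_step(1.0, 0.0, 2.0, zeros))

# A strongly negative forget-gate bias resets the cell state.
reset = dict(zeros, f=(0.0, 0.0, -50.0))
print(lstm_step(1.0, 0.0, 2.0, reset))
```

The cell state c_t is updated additively, which is what allows gradients
to flow over many time steps.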
Root Mean Square Error (RMSE): The RMSE is defined as the square root of
the mean of the squared differences between the predicted and actual values
over the Z measurements. The lower the value, the smaller the deviation
between prediction and observation [24].
$\mathrm{RMSE} = \sqrt{\frac{1}{Z} \sum_{x=1}^{Z} \left( b_x - \hat{b}_x \right)^2}$ (24)
Mean Absolute Percentage Error (MAPE): The MAPE is the average, over all
observations, of the absolute difference between the actual and predicted
values divided by the actual value [24]. Because absolute values are used,
positive and negative errors cannot cancel each other out, so the measure
properly reflects the magnitude of the actual forecast error.
$\mathrm{MAPE} = \frac{1}{Z} \sum_{x=1}^{Z} \left| \frac{b_x - \hat{b}_x}{b_x} \right|$ (25)
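The two error measures can be computed as in the following sketch, where
`actual` holds the observed values and `predicted` the forecasts. The
function names and the sample values are assumptions for illustration; the
MAPE is returned as a fraction and requires non-zero observed values.

```python
import math


def rmse(actual, predicted):
    """Root mean square error over Z paired observations."""
    z = len(actual)
    return math.sqrt(
        sum((b - b_hat) ** 2 for b, b_hat in zip(actual, predicted)) / z)


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction of the actual value."""
    z = len(actual)
    return sum(abs(b - b_hat) / b for b, b_hat in zip(actual, predicted)) / z


actual = [100, 200, 400]        # hypothetical observed flows
predicted = [110, 180, 400]     # hypothetical forecasts

print(round(rmse(actual, predicted), 2))   # 12.91
print(round(mape(actual, predicted), 4))   # 0.0667
```

The RMSE is expressed in the same unit as the flow, whereas the MAPE is
relative, so the two measures are best read together.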
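$R = \dfrac{\sum_{x=1}^{Z} (b_x - \bar{b})(\hat{b}_x - \bar{\hat{b}})}{\sqrt{\sum_{x=1}^{Z} (b_x - \bar{b})^2 \, \sum_{x=1}^{Z} (\hat{b}_x - \bar{\hat{b}})^2}}$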
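The goodness of fit between observed and predicted flows is summarized in
this paper by a correlation coefficient R. A minimal sketch is given below,
assuming that R is the Pearson correlation coefficient between the two
series; the function name and sample values are hypothetical.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    z = len(actual)
    mean_a = sum(actual) / z
    mean_p = sum(predicted) / z
    cov = sum((a - mean_a) * (p - mean_p)
              for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_a * var_p)


print(correlation_coefficient([1, 2, 3], [2, 4, 6]))   # 1.0
print(correlation_coefficient([1, 2, 3], [3, 2, 1]))   # -1.0
```

A value close to 1 means that the forecasts rise and fall with the
observations, even if their absolute level is biased, which is why R is
reported alongside the error measures.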
4.3. Results
This study compares the performance of the deep learning approaches SAE,
CNN, and LSTM. The SAE, CNN, and LSTM parameter values are comparable to
those used in the state of the art. Figure 5 illustrates the prediction
results. Convolutional neural networks are better at extracting spatial
features than temporal features. The LSTM, a well-known time series ML
approach, was used primarily to extract temporal features. To reduce noise
and detect missing data, a stacked denoising autoencoder is used [25].
Missing values in the data affect the results of subsequent estimation and
forecasting. For road traffic, the precision of the forecast results is
critical since accurate forecasts help to avoid accidents, collisions, and
further congestion. According to the results, accounting for missing data
improves prediction accuracy in all approaches. “MAEs, MREs, and RMSEs” are
used to measure the differences between forecast and actual values. Because
predictions are made at many locations and times, the forecasting accuracy
of the spatial and temporal distributions is very important. To evaluate
the performance of spatial and temporal distribution prediction, a
correlation coefficient (R) is defined.
[Figure: four bar-chart panels (MAE, MAPE, RMSE, R); x-axis: Prediction
Model (SAE, CNN, LSTM); y-axis: Error Rate]
Table 2 also reports these evaluation indicators. As seen in the table, the
SAE approach on its own gives the weakest prediction result of the three,
with larger errors and a poor fit. The CNN algorithm has a prediction
performance similar to that of a simple neural network approach and is
somewhat better than the SAE technique. The LSTM approach gives the best
prediction result of the deep learning techniques considered, since it has
the lowest MAE and RMSE as well as the best-fitting R value.
It can be seen that the SAE predicted value is generally stable, and that
the prediction is most accurate when fluctuations are small, whereas
performance is low during large fluctuations and at peak times. The “CNN
technique is more accurate in the prediction of stable time series and peak
hours, but not in terms of low fluctuations. The LSTM technique is more
appropriate in the prediction of stable time series and peak hours, and can
well fit the real-time series of traffic flow while maintaining high
prediction accuracy”.
Model   Accuracy (%)
SAE     79.25
CNN     83.27
LSTM    87.95
[Figure: bar chart of Accuracy (percentage) by Prediction Model (SAE, CNN,
LSTM)]
5 Conclusion
This paper proposed a traffic flow prediction approach based on DL
techniques. Data is gathered from several sources, and evaluation criteria
are chosen. The performance of the proposed algorithms is measured using
metrics such as RMSE, MAPE, MAE, and R. For data collection, the proposed
approaches employ a multi-modal framework. A deep learning model is used to
make a prediction for each data source. To extract the spatial information
of neighboring crossroads, the method incorporates a CNN. The LSTM captures
the time-series information of the traffic flow, and the extracted spatial
and time-series information is then integrated in the SAE technique for
prediction. The results show that the LSTM approach has a higher prediction
accuracy than a system that examines only time or only space
characteristics, and thus can accurately represent changing traffic flow
conditions.
References
[1] Chen, X., Chen, H., Yang, Y., Wu, H., Zhang, W., Zhao, J., & Xiong, Y. (2021).
Traffic flow prediction by an ensemble framework with a data denoising and
deep learning model. Physica A: Statistical Mechanics and its Applications, 565,
125574.
[2] Do, L. N., Vu, H. L., Vo, B. Q., Liu, Z., & Phung, D. (2019). An effective spa-
tial-temporal attention-based neural network for traffic flow prediction. Trans-
portation research part C: emerging technologies, 108, 12-28.
[3] Li, Y., Chai, S., Ma, Z., & Wang, G. (2021). A Hybrid Deep Learning Frame-
work for Long-Term Traffic Flow Prediction. IEEE Access, 9, 11264-11271.
[4] Yu, B., Song, X., Guan, F., Yang, Z., & Yao, B. (2016). k-Nearest neighbor
model for multiple-time-step prediction of short-term traffic condition. Journal
of Transportation Engineering, 142(6), 04016018.
[5] Lv, Y., Duan, Y., Kang, W., Li, Z., & Wang, F. Y. (2014). Traffic flow predic-
tion with big data: a deep learning approach. IEEE Transactions on Intelligent
Transportation Systems, 16(2), 865-873.
[6] Chen, C. P., & Zhang, C. Y. (2014). Data-intensive applications, challenges,
techniques, and technologies: A survey on Big Data. Information sciences, 275,
314-347.
[7] Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of
data with neural networks. Science, 313(5786), 504-507.
[8] Yin, X., Wu, G., Wei, J., Shen, Y., Qi, H., & Yin, B. (2021). Deep learning on
traffic prediction: Methods, analysis, and future directions. IEEE Transactions
on Intelligent Transportation Systems.
[9] Karthika, B., UmaMaheswari, N., & Venkatesh, R. (2019). A Research of Traffic
Prediction using Deep Learning Techniques. International Journal of Innovative
Technology and Exploring Engineering (IJITEE), 8(9S2), July 2019.
[10] Zhao, R., Yan, R., Wang, J., & Mao, K. (2017). Learning to monitor machine
health with convolutional bi-directional LSTM networks. Sensors, 17(2), 273.
[11] Saravanan, S., & Venkatachalapathy, K. A Deep Hybrid Model for Traffic Flow
Prediction using CNN-rGRU.
[12] Shi, Y., Feng, H., Geng, X., Tang, X., & Wang, Y. (2019, November). A survey
of hybrid deep learning methods for traffic flow prediction. In Proceedings of
the 2019 3rd international conference on advances in image processing (pp.
133-138).
[13] Xu, C., Li, Z., & Wang, W. (2016). Short-term traffic flow prediction using a
methodology based on autoregressive integrated moving average and genetic
programming. Transport, 31(3), 343-358.
[14] Zhang, H., Wang, X., Cao, J., Tang, M., & Guo, Y. (2018). A hybrid short-term
traffic flow forecasting model based on time series multifractal characteris-
tics. Applied Intelligence, 48(8), 2429-2440.
[15] Koesdwiady, A., Soua, R., & Karray, F. (2016). Improving traffic flow
prediction with weather information in connected cars: A deep learning
approach. IEEE Transactions on Vehicular Technology, 65(12), 9508-9517.
[16] Huang, W., Song, G., Hong, H., & Xie, K. (2014). Deep architecture for
traffic flow prediction: Deep belief networks with multitask learning. IEEE
Transactions on Intelligent Transportation Systems, 15(5), 2191-2201.
[17] Li, D., Deng, L., Cai, Z., & Yao, X. (2018). Notice of retraction: intelligent
transportation system in Macao based on deep self-coding learning. IEEE Trans-
actions on Industrial Informatics, 14(7), 3253-3260.
[18] Fu, R., Zhang, Z., & Li, L. (2016). Using LSTM and GRU Neural Network
Methods for Traffic Flow Prediction. In 31st Youth Academic Annual Conference
of Chinese Association of Automation, Wuhan, China, November 11-13, 2016.
[19] Palm, R. B. (2012). Prediction as a candidate for learning deep hierarchical mod-
els of data. Technical University of Denmark, 5.
[20] Jiang, L. (2020, September). Traffic Flow Prediction Method Based on Deep
Learning. In Journal of Physics: Conference Series (Vol. 1646, No. 1, p.
012050). IOP Publishing.
[21] Chen, X., Xie, X., & Teng, D. (2020, June). Short-term Traffic Flow Prediction
Based on ConvLSTM Model. In 2020 IEEE 5th Information Technology and
Mechatronics Engineering Conference (ITOEC) (pp. 846-850). IEEE.
[22] Wang, Z., & Pyle, T. (2019). Implementing a pavement management system:
The Caltrans experience. International Journal of Transportation Science and
Technology, 8(3), 251-262.
[23] Chen, X., Chen, H., Yang, Y., Wu, H., Zhang, W., Zhao, J., & Xiong, Y. (2021).
Traffic flow prediction by an ensemble framework with a data denoising and
deep learning model. Physica A: Statistical Mechanics and its Applications, 565,
125574.
[24] Hassan, A., & Mahmood, A. (2017, April). Deep learning approach for senti-
ment analysis of short texts. In 2017 3rd international conference on control,
automation, and robotics (ICCAR) (pp. 705-710). IEEE.
[25] Yang, B., Sun, S., Li, J., Lin, X., & Tian, Y. (2019). Traffic flow prediction us-
ing LSTM with feature enhancement. Neurocomputing, 332, 320-327.