1 s2.0 S1877050920300892 Main
Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 165 (2019) 252–258 www.elsevier.com/locate/procedia
Abstract
This paper presents a computer-vision-based framework that aims to aid the task of driving. The framework serves the
purpose of road analysis, which is divided into two sub-tasks: the first is the recognition of different road signs, and
the second is lane analysis. Driving requires humans to multitask and perform many operations within split seconds.
The framework is introduced to aid this task, if not completely automate it, while keeping in mind that it should run on
a simple hardware and software setup. The effectiveness of the framework lies in its minimal complexity, which enables
it to be used in real time. The results of the pipeline are quantified first by measuring its accuracy in the classification
of road signs, second by measuring its ability to gather information about the road (lane analysis and vehicle
detection), and third by time benchmarking.
© 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the INTERNATIONAL CONFERENCE ON RECENT TRENDS
IN ADVANCED COMPUTING 2019.
Keywords: Computer Vision, Image Processing, Vehicle Detection, Convolution Neural Network, Transfer Learning
1. Introduction
Driving is an amalgamation of different complex tasks and requires undivided attention from the driver. Long
stretches of driving can become exhausting and may cause a lack of attentiveness, which may lead to accidents. Owing
to recent advances in image processing and deep learning techniques, frameworks can be developed with the
aim of aiding the process of driving. Figure 1 describes a framework capable of road sign recognition and road analysis
involving lane and vehicle detection. Road sign recognition relies on a deep learning architecture, while road
analysis is achieved by a combination of different image processing pipelines. To train our model for recognition
of road signs, we incorporate a huge variety of traffic signs used by different road work departments. This is done
to achieve a generalized recognition model. Training models on such a huge variety of data can cause a reduction in
accuracy, so the deep learning architecture is built to handle a variety of traffic signs, as performed in [1] [2]. The
next task aims at detecting the different vehicles that are in motion on the road and analyzing the lane of the moving
vehicle.
2. Literature Survey
A considerable amount of work has been done on driver-assistance technologies. Some recent work includes a
framework aimed at detection, tracking, and recognition of different road signs. For detection of the road signs,
a Haar-cascade-based system coupled with AdaBoost was used [3]. The classification of the detected road signs was
performed using a Bayesian classifier. The main focus of that paper is a joint modeling of colour and shape within the
AdaBoost framework. In the paper "Robust method for road sign detection and recognition" [4], a three-fold framework
is proposed. The first part aims at detection of road signs by drawing on some prior information about the scenario or
colour. The second part focuses on geometric analysis of the edges detected in the first step, which generates candidate
circular and triangular signs. The last part of the framework is a recognition phase that performs validation
using cross-correlation methods. The post-processing part of the framework consists of time-based integration
based on Kalman filtering. In [5], as in [3], the focus is heavily on using features based on the colour and shape of
traffic signs for detection and identification of road signs. Support vector machines are used as the classifier for
identifying the different road signs. Their recognition process can be divided into three parts: first, discrimination
according to the colour of the pixel; second, traffic-sign detection by shape classification using support vector
machines; and third, context understanding based on Gaussian-kernel SVMs. The quoted results highlight a high success
rate and a very low number of false positives in the final recognition stage. Wen-Jia Kuo and Chien-Chung Lin in [6]
proposed a two-step framework for detection and recognition of road signs. The first step is a detection step that
uses the Hough transform, corner detection, and projection to estimate the location of the road sign in the image
under noisy and complicated environments. The recognition task uses a tree-based approach in which convolution, an (RBF)
deep neural network, and a k-d tree are implemented to recognise the road signs in two stages.
In [7] a new traffic sign detection system was proposed that simultaneously estimates the location and precise
boundary of traffic signs using a convolutional neural network (CNN). It addresses a limitation of most methods, which
provide only bounding boxes of traffic signs as output and hence require additional processes such as contour estimation
or image segmentation to obtain the precise boundary of signs. In [8] Songwen Pei, Fuwu Tang, Yanfei Ji, Jing Fan,
and Zhong Ning proposed the use of Multi-Scale Deconvolution Networks to address the large amount
of time taken for preprocessing the images and applying complicated algorithms for improving and finding blurred
and subpixel images of the signs. Multi-Scale Deconvolution Networks (MDN) combine multi-scale convolutional
networks with a deconvolution network, resulting in efficient and robust training of a localized traffic sign recognition
model. In [9] a road sign recognition and classification system is presented that is based on a three-step algorithm
consisting of colour segmentation, shape identification, and a deep neural network architecture. The ultimate aim of the
algorithm is to recognise and distinguish the varied road signs present along Italian roads. The system is designed for
real-time application. Shape detection is achieved using two different methods, one based on pattern matching and the
other on edge detection and geometrical cues. Radu Timofte, Karel Zimmermann, and Luc Van Gool in [10] proposed a
multi-view traffic sign detection method that uses a multidimensional algorithm to improve results beyond the
state of the art. The focus was to move beyond detection methods that still rely on single-view detection. A speedup
in the process is achieved through a novel bounded evaluation of ensemble AdaBoost detectors. The 2D detections in
multiple views are combined to estimate 3D hypotheses.
254 Utkarsh Shukla et al. / Procedia Computer Science 165 (2019) 252–258
Figure 1: Framework capable of road signs recognition and analysis involving lane and vehicle detection
3. Data Descriptions
For detecting signs at the roadside, the model is trained on standard images of different
traffic signs. The dataset used is the German Traffic Sign Benchmark. The dataset has the
following features:
• The dataset contains 43 different categories of road signs in total, all of which are used for recognition.
• More than 50,000 images in total, depicting ground-truth data.
• Physical traffic sign instances are unique within the dataset (each real-world traffic sign occurs only once).
For road analysis, the KITTI Vision Benchmark dataset is used. The dataset was collected by equipping a standard station
wagon with two high-resolution colour and grayscale video cameras. The data was collected by driving around the
mid-size city of Karlsruhe, in rural areas, and on highways.
4. Proposed Method
The framework is divided into two objectives: the first is recognizing road signs, and the second is the analysis
of the road. To meet the first objective we use the deep learning approach of Convolution Neural Networks.
We first computed our results over different popularly used architectures and then settled on transfer
learning for this purpose. Transfer learning is a concept in deep learning in which a
model trained for one task is re-purposed for a second, related task. We use the ResNet50 architecture available in Keras
as our base model for the final implementation. We customize this model through hyper-parameter optimization and
by adding custom layers to fit our use case. The model displayed is just the customized architecture attached to
the pre-trained ResNet model. The second objective first has to deal with the main task of road analysis, as it plays a
vital role in roadside vehicle detection while traveling. This process involves two sub-processes:
• Analysing the different lanes of the road and identifying the current lane so that it can be used to determine future
maneuvers.
• Tracking the vehicles and objects around the car.
At first, the focus is on finding the track on which our vehicle is running, or a track that is vacant, that is, one
with no vehicles in front. Then we aim at detecting the lanes in a video frame with cars present on them.
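The transfer-learning setup described in this section can be sketched roughly as follows. This is only an illustrative sketch, not the exact architecture used here: the input size, the width of the dense layer, the dropout rate, the optimizer settings, and the function name `build_sign_classifier` are all assumptions chosen for the example. Only the ResNet50 base model from Keras and the 43 output classes come from the text. Inputs are assumed to be preprocessed already for ResNet50.

```python
from tensorflow import keras

NUM_CLASSES = 43  # number of road sign categories in the dataset


def build_sign_classifier(input_shape=(64, 64, 3), weights="imagenet", dropout=0.5):
    """Attach a small custom head to a frozen ResNet50 base (hypothetical head layout)."""
    # Base network without its ImageNet classification layer.
    base = keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)            # run the base in inference mode
    x = keras.layers.GlobalAveragePooling2D()(x)
    x = keras.layers.Dense(256, activation="relu")(x)   # assumed layer width
    x = keras.layers.Dropout(dropout)(x)                # assumed dropout rate
    outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # assumed setting
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Freezing the base keeps the number of trainable parameters small, which is one common way to train such a model quickly. Passing `weights=None` builds the same graph without downloading pre-trained weights, which is convenient for checking shapes.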
5. Experimental Setup
This section gives detailed information about the experimental setup involved in the
Utkarsh Shukla et al. / Procedia Computer Science 165 (2019) 252–258 257
performed to eliminate any noisy pixels that do not constitute an edge. This is used to highlight the edges of the
different road lanes.
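As a rough illustration of highlighting edges while discarding weak, noisy responses, the sketch below computes a Sobel gradient magnitude on a small grayscale image and keeps only pixels above a threshold. The choice of the Sobel operator, the threshold value, and the function name `sobel_edges` are assumptions made for this example. The exact edge detector used in the pipeline is not specified in the surviving text.

```python
def sobel_edges(gray, threshold):
    """Return a binary edge map (1 = edge) for a 2-D list of grayscale intensities.

    Border pixels are left as 0 because the 3x3 Sobel window does not fit there.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            # Weak responses (for example, isolated noisy pixels) are dropped.
            if magnitude >= threshold:
                edges[y][x] = 1
    return edges


# Dark road surface on the left, bright lane marking on the right, one noisy pixel.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
image[2][1] = 10
edge_map = sobel_edges(image, threshold=200)
```

In this toy image the strong dark-to-bright transition is marked as an edge, while the small intensity change caused by the noisy pixel falls below the threshold and is ignored.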
After training the proposed deep learning architecture for road sign detection for 500 epochs, as shown in 3, we obtain
an accuracy of 98.21% with a precision of 93.94%. The lane detection workflow was able to detect lanes, objects, and
vehicles in both still images and moving video frames (Figure 5), at a total processing rate of 64 frames per second
and with a complexity of
$O(n^2)$ (3)
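For reference, accuracy and precision for a multi-class classifier can be computed as in the sketch below. It assumes that precision is macro-averaged over classes, which is an assumption made for illustration since the averaging scheme is not stated. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (assumed averaging scheme)."""
    true_positive = defaultdict(int)
    predicted = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        if t == p:
            true_positive[p] += 1
    classes = set(y_true) | set(y_pred)
    # A class that is never predicted contributes a precision of 0.
    per_class = [
        true_positive[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Toy example with three sign classes.
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
acc = accuracy(labels, predictions)
prec = macro_precision(labels, predictions)
```

Because macro averaging weights every class equally, a few poorly predicted rare classes can pull precision below overall accuracy.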
Figure 5: Results
Once trained, the model predicts a batch of 128 images in 3 seconds. The model size ranges from 56.8 MB to
57.6 MB depending on the amount of data used to train it. Since the model is small, it is easily deployable,
and the lane detection uses only matrix operations, which are heavily optimized to run on low-level hardware. If
GPUs can be used in the vehicles, libraries such as MAGMA can hugely reduce the time and computational expense
of lane and object detection.
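The timing figures quoted above can be converted into per-item latency and throughput with simple arithmetic. The helper names below are hypothetical, and the sketch only restates the reported numbers (128 images in 3 seconds, 64 frames per second) in other units.

```python
def throughput_per_second(items, seconds):
    """Items processed per second."""
    return items / seconds


def latency_ms(items, seconds):
    """Average time per item in milliseconds."""
    return 1000.0 * seconds / items


# Sign classifier: a batch of 128 images in 3 seconds.
classifier_rate = throughput_per_second(128, 3.0)   # about 42.67 images per second
classifier_latency = latency_ms(128, 3.0)           # 23.4375 ms per image

# Lane detection: 64 frames per second.
lane_frame_budget = latency_ms(64, 1.0)             # 15.625 ms per frame
```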
7. Future Work
Future work consists of adding modularity to the implementation of the different objectives and adopting
better hardware-compatible coding standards. In addition, we have only proposed the algorithm for multi-lane roads,
where it advises the driver to stay in the same lane. Many roads have a single lane carrying traffic in both
directions, especially in India, so an improvement for single-lane roads is required. This task may
involve various new complexities. We can also extend this work to bad weather conditions, where it is hard to predict
the lane while driving.
References
[1] B. Huval, T. Wang, S. Tandon, J. Kiske, W. Song, J. Pazhayampallil, M. Andriluka, P. Rajpurkar, T. Migimatsu, R. Cheng-Yue, et al., “An
empirical evaluation of deep learning on highway driving,” arXiv preprint arXiv:1504.01716, 2015.
[2] C. Chen, A. Seff, A. Kornhauser, and J. Xiao, “Deepdriving: Learning affordance for direct perception in autonomous driving,” in Proceedings
of the IEEE International Conference on Computer Vision, pp. 2722–2730, 2015.
[3] C. Bahlmann, Y. Zhu, V. Ramesh, M. Pellkofer, and T. Koehler, “A system for traffic sign detection, tracking, and recognition using color,
shape, and motion information,” in IEEE Proceedings. Intelligent Vehicles Symposium, 2005., pp. 255–260, IEEE, 2005.
[4] G. Piccioli, E. De Micheli, P. Parodi, and M. Campani, “Robust method for road sign detection and recognition,” Image and Vision Computing,
vol. 14, no. 3, pp. 209–223, 1996.
[5] S. Maldonado-Bascón, S. Lafuente-Arroyo, P. Gil-Jimenez, H. Gómez-Moreno, and F. López-Ferreras, “Road-sign detection and recognition
based on support vector machines,” IEEE transactions on intelligent transportation systems, vol. 8, no. 2, pp. 264–278, 2007.
[6] W.-J. Kuo and C.-C. Lin, “Two-stage road sign detection and recognition,” in 2007 IEEE international conference on multimedia and expo,
pp. 1427–1430, IEEE, 2007.
[7] H. S. Lee and K. Kim, “Simultaneous traffic sign detection and boundary estimation using convolutional neural network,” IEEE Transactions
on Intelligent Transportation Systems, vol. 19, no. 5, pp. 1652–1663, 2018.
[8] S. Pei, F. Tang, Y. Ji, J. Fan, and Z. Ning, “Localized traffic sign detection with multi-scale deconvolution networks,” in 2018 IEEE 42nd
Annual Computer Software and Applications Conference (COMPSAC), vol. 1, pp. 355–360, IEEE, 2018.
[9] A. Broggi, P. Cerri, P. Medici, P. P. Porta, and G. Ghisio, “Real time road signs recognition,” in 2007 IEEE Intelligent Vehicles Symposium,
pp. 981–986, IEEE, 2007.
[10] R. Timofte, K. Zimmermann, and L. Van Gool, “Multi-view traffic sign detection, recognition, and 3d localisation,” Machine vision and
applications, vol. 25, no. 3, pp. 633–647, 2014.