Pothole Detection and Dimension Estimation by Deep
1 sasankchadalavada@gmail.com; 2 tnteja@gmail.com
Abstract. Road maintenance after construction is crucial for extending the design life of a
pavement. Without proper maintenance, deterioration occurs more rapidly, and potholes are the
most common type of road distress, posing a significant hazard to passengers and vehicles.
Automated inspection systems can improve road maintenance, contributing to road safety and
reducing infrastructure costs. In this paper, one such automated pothole detection system is
proposed by applying a CNN (Convolutional Neural Network), a deep learning approach, with
the YOLO (You Only Look Once) object detector to detect potholes in real time. The proposed
model is trained from scratch on a large pothole dataset for 200 epochs, and is validated and
tested on a custom-made dataset. The trained model achieved an mAP50 of 92% in pothole
detection. Further, an image processing method based on a spatial resolution factor is used to
estimate the dimensions of the potholes. The findings of this study assist in non-destructive,
automatic inspection of pavement condition, which also contributes to improving road safety
and reducing the time and cost required for road maintenance.
1. Introduction
Road maintenance plays a vital role in ensuring the safety, efficiency, and longevity of transportation
infrastructure. It involves regular inspection, repair, and upkeep of roads, highways, and related
components to address wear and tear caused by traffic, weather conditions, and other factors.
Without proper maintenance of roads, degradation manifests itself more rapidly in the form of
pavement distresses such as potholes, cracks, rutting, bleeding and ravelling. These distresses have
several negative effects on transportation: they cause vehicle damage and traffic congestion, thereby
increasing travel times, and tend to increase the number of accidents. This inefficiency not only
impacts individual commuters but also hampers the movement of goods and services, affecting overall
productivity and economic growth. The importance of road maintenance therefore cannot be
overstated, as it directly affects various aspects of society, including safety and overall quality of life.
Investing in regular and proactive road maintenance programs is essential to mitigate the effects of
pavement distress and ensure sustainable transportation systems.
Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd 1
CISCE-2023 IOP Publishing
IOP Conf. Series: Earth and Environmental Science 1326 (2024) 012100 doi:10.1088/1755-1315/1326/1/012100
Of the many pavement distresses caused by improper road maintenance, potholes are the most
common. The formation and progression of potholes are influenced by several factors, including
improper pavement design, selection of inferior pavement materials, and poor construction and
maintenance practices. Even after construction, heavy traffic loading and harsh climatic conditions
lead to potholes. Potholes pose a serious threat to life and vehicles through sudden jolts, tire
blowouts, suspension damage and misalignment. Repairing such damage can be costly for vehicle
owners, leading to an unexpected increase in vehicle operating costs. Unforeseen encounters with deep
potholes can cause drivers to lose control of their vehicles, resulting in accidents and potential
injuries. According to the Road Accidents in India 2021 [1] report by MoRTH, potholes account for
nearly 0.8 percent of total road accidents, 1.4 percent of road accident deaths and 0.6 percent of
injuries. Compared to 2020, this is a rise of 1.7 percent in the number of accidents, 0.7 percent in
fatalities and 1.3 percent in injuries, as shown in Figure 1. This makes road maintenance of paramount
importance. The history of road maintenance dates back thousands of years. In the late 20th century,
the concept of road asset management gained prominence, taking into account the entire lifecycle of
road assets. Asset management approaches emphasised data collection, condition assessment,
prioritisation and optimisation of maintenance. This led to visual inspections performed by certified
inspectors or structural engineers using a manual process (Network Survey Vehicle) to collect the
features of pavement distresses, which consumes a lot of time, labour and money. Thus, there is a need
for an autonomous system that can reduce time, labour and money. In this paper, one such automated
pothole detection system is proposed using deep learning and image processing techniques. The
proposed custom-made system is built from scratch by applying a CNN (Convolutional Neural Network)
with the state-of-the-art YOLO (You Only Look Once) object detector to detect potholes.
Christian Koch and Ioannis Brilakis [3] proposed a pothole detection method that uses a fisheye
camera attached to the rear bumper of a car to overcome the limitations of traditional methods. The
method is implemented in MATLAB with a dataset of 120 pavement images showing distresses such as
potholes and cracks under different lighting conditions. Of these, 70 images are used for training and
50 for testing; they were captured manually with the vehicle's fisheye camera at 8 to 15 mph around
Georgia. Each image is then divided into regions with and without defects using a triangle-based
histogram shape thresholding algorithm. Potholes and cracks in the defect images are distinguished
based on the linearity of the region shapes. The linearity index is based on the eccentricity value,
where the eccentricity properties are determined from the length of the major axis (lmax), the position
of the centroid (Pcent) and the orientation angle (α). Based on these properties, if the eccentricity of
the defect is larger than 0.99 then it is
distinguished as a crack. To identify potholes, the surface texture inside a candidate region is
compared with the texture of the surrounding region. To characterise texture properties such as
smoothness, coarseness and granularity, statistical measures of grey-level intensity and filter banks
are applied to the grey-level images to describe both the inner and the exterior texture. Based on the
geometric properties of the defect region, the shape of the pothole is approximated using elliptic
regression. The filter banks are tested under varying lighting and viewing conditions. The method
reaches an overall accuracy of 86%. It helps automate pavement distress detection rather than relying
on traditional manual inspection.
Prof. Vijayalakshmi B et al. [4] presented an approach based on
Raspberry Pi for recognising humps and potholes and providing appropriate warnings to drivers. The
work is based on machine learning and image processing techniques. The hardware comprises a
Raspberry Pi (a low-cost single-board computer), a 5 MP Pi Camera connected to the Raspberry Pi, and
an Arduino microcontroller programmed through the Arduino IDE. OpenCV and the Anaconda
environment are used for computer vision and for plotting 2D/3D graphs. The Raspberry Pi camera
interface is used to capture the live camera feed remotely on a laptop. In the system workflow, images
or video are acquired by the camera and stored in a database that serves as the training set, against
which newly captured images are compared. The "imread" function is used to read the images for
preprocessing. Blob detection is performed on each frame, and the detected blobs are matched against
images from the training database. Additionally, ultrasonic sensors are used to recognise potholes by
estimating the depth of the road surface. The system uses IoT to store the location of each identified
pothole in the cloud. This technique opens up new possibilities for driverless cars.
Hanshen Chen [5] proposed a new system which
is a location-aware convolutional neural network (LCNN), since detecting potholes in road images is
a challenging task owing to varying scales, shadows and illumination effects. It consists of two
primary subnetworks: the first, the localization network, uses a high-recall model to locate as many
candidate regions as feasible, and the second, the part-based network, classifies the candidates on
which the network should focus. The localization network (LCNN) introduces a point-guided network
that replaces the traditional regression formulation to predict the location of the pothole centre
point. LCNN is based on ResNet50, which is used for image feature extraction, and is trained with mean
squared error (MSE) as the loss between the predicted and ground-truth heatmaps. The LCNN first
predicts a heatmap that represents the pothole locations. If the input image is predicted as negative,
it is not fed into the second stage of processing. In the second stage, the proposed part-based
classification network (PCNN) consists of two components: a part extraction network and a binary
classification network. PCNN takes the proposed coordinates as input and uses them for part
extraction: each local image region is cropped from the original road image. This gives the cropped
parts enough resolution (224 x 224 pixels)
to determine in detail whether a region contains a pothole. The model is evaluated on a public pothole
dataset from the 2017 Data Science Hackathon, composed of 4026 training and 1650 testing images in
two classes: positive (images that contain potholes) and negative (images without potholes). LCNN
and PCNN were initialized with weights pre-trained on ImageNet classification to speed up training.
A few augmentation techniques, such as random brightness and contrast changes, are applied for LCNN
training.
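The heatmap loss mentioned above for LCNN can be illustrated with a minimal sketch. The function
name heatmap_mse and the toy 2 x 2 heatmaps are hypothetical and are not taken from [5]; the code
only shows how a mean squared error between a predicted and a ground-truth heatmap is computed.

```python
def heatmap_mse(pred, target):
    """Mean squared error between two equally sized 2D heatmaps (lists of rows)."""
    same_rows = len(pred) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("heatmaps must have the same shape")
    total = 0.0
    count = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2  # squared error for one heatmap cell
            count += 1
    return total / count


# Hypothetical heatmaps: the ground truth marks one pothole centre at the bottom-right cell.
predicted = [[0.0, 0.5], [0.5, 1.0]]
ground_truth = [[0.0, 0.0], [0.0, 1.0]]
loss = heatmap_mse(predicted, ground_truth)
print(loss)
```

A lower value means the predicted heatmap lies closer to the annotated pothole centres; a real
training loop would average this quantity over a batch of images.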
The two subnetworks are pre-trained independently for 100 epochs. At test time, they are
concatenated to perform end-to-end classification. Testing on the positive samples showed an LCNN
accuracy of 97.7%, and all of the negative proposals were successfully eliminated by the PCNN, as it
is based on detail extraction. This methodology can be further applied to other distresses such as
bumps, cracks or rutting.
Tao Ma et al. [6] proposed employing a DJI UAV with a downward-looking high-resolution Sony Alpha
7R camera system to collect pavement distress information from images. All 3151 UAV pavement images
were obtained from Dongji Avenue in Nanjing, China, from a height of 30 m above the ground so that the
entire road width is captured in a single image. The dataset of 3151 images covers six pavement
distress types, including transverse crack (TC), longitudinal crack (LC), alligator crack (AC),
oblique crack (OC) and pothole. The LabelImg tool is used to annotate the images in the dataset. To
classify these distresses automatically, three typical object-detection algorithms (Faster R-CNN,
YOLOv3 and YOLOv4) were used to train models on the dataset, and their
prediction performances are compared. The dataset is divided 80% and 20% for training and testing
respectively, and 10% of the training set is further set aside for validation, which helps in
fine-tuning the hyperparameters. Training was initialized with weights pre-trained on the COCO
dataset in order to reduce the training period. The Faster R-CNN, YOLOv3 and YOLOv4 models are all
trained for 100 epochs, and their performance is evaluated by IoU and mAP values. Faster R-CNN with a
ResNet50 backbone and Faster R-CNN with a VGG16 backbone both exhibit inferior performance in
pavement distress detection, with a mAP below 50%. YOLOv3 and YOLOv4 generally performed better than
the Faster R-CNN models, and YOLOv3 has a higher mAP than YOLOv4. To verify the stability of these
models, repeated predictions were conducted using the six models, and YOLOv3 again demonstrated
superior results. The reported mAP values are 48.8% for Faster R-CNN, 56.6% for YOLOv3 and 53.3% for
YOLOv4. As a result, the suggested pavement distress detection method employing a UAV is a
practicable option and helps with non-destructive, automatic inspection of pavement condition.
Pranjal A. Chitale et al. [7] have
implemented and compared the performance of multiple versions of YOLO (v3 and v4) for pothole
detection, and used a triangular similarity measure to estimate pothole dimensions. The model is
built on a custom dataset of 1300 images of both Indian and foreign roads, taken by a camera at an
elevation of 90 cm above the ground, with a mix of dry and wet potholes. The images are annotated in
YOLO format using the LabelImg tool. The study is divided into a pothole detection model and a
dimension estimation model. For pothole detection, YOLOv3 and YOLOv4 models are trained for 3000,
4000, 5000 and 6000 epochs and evaluated by mAP and IoU values, with YOLOv3 reaching a maximum
accuracy of 88.9% at 4000 epochs and YOLOv4 a maximum accuracy of 93.3% at 4000 epochs. The
dimension estimation model uses image processing and is calibrated with an object of known width 'W'
placed at a distance 'D' from the camera. Taking a picture of this object from the distance 'D' gives
its apparent width in pixels, 'P'. Dimensions obtained in pixels are converted into metric
measurements by calculating the number of pixels per inch of the image. The dimensions estimated by
the model are compared with ground measurements, and the model estimates them with an accuracy of
95%.
Boris Bucko [8] implemented
a system with the primary objective of investigating how adverse conditions influence the accuracy
of pothole detection. A dataset of weather subsets (1052 images in clear weather, 286 in rain, 201 at
sunset, 250 in the evening and 310 at night) was recorded under different light and weather
conditions in May, June and July by mounting a camera on the dashboard of a car travelling at
40 km/h, supplemented by a few images from other online pothole databases. The dataset is divided in
a 70:15:15 ratio for training, validation and testing respectively. The model is built using the
YOLOv3 algorithm with Darknet-53 as the backbone for feature extraction. The performance of YOLOv3
is measured in terms of mAP and compared with the Sparse R-CNN model: YOLOv3 achieved 77% and Sparse
R-CNN 72%. Although Sparse R-CNN gave better results in low light conditions, YOLOv3 performed
better under brighter light conditions. It is therefore concluded that YOLOv3 is still a suitable
alternative for pothole detection.
Nienaber S et al.
[9] used image processing to identify potholes on roads. A total of 48,913 images were captured using
a GoPro camera mounted on a vehicle moving at 60 km/h, to reflect the scenario of a device that can be
fitted to a vehicle for commercial use. Irrelevant information in the frame, such as foliage, leads to
false positives. Therefore, to extract the road surface more accurately, a convex hull algorithm was
applied to the extracted contours. The pothole detection approach begins by converting the extracted
road segment image to grayscale. To clean up this image and remove noise, a Gaussian filter is applied
to the grayscale image. A simple differentiation-based edge detection algorithm (Canny edge
detection) is then performed on the extracted road surface. Dilation of an image increases the area of
the lighter pixels. As a result, when dilation is performed, the unwanted edges close to the outer
boundaries become absorbed into the outer
boundary, leaving only the boundary contour visible. The contours are then filtered, and those that
do not meet the size constraints of the pothole model are discarded. The study reported a precision of
81.8%. Owing to the nature of the algorithm, two closely spaced potholes are grouped together and seen
as a single contour; as the car drew nearer, the potholes eventually separated far enough to become
distinct from one another.
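The effect of dilation described above can be illustrated with a minimal pure-Python sketch. It is
not the implementation used in [9]; the function name dilate, the square structuring element and the
toy edge map are assumptions made for illustration.

```python
def dilate(binary, radius=1):
    """Binary dilation with a square (2*radius + 1) structuring element.

    Every pixel within `radius` rows and columns of a lit pixel becomes lit.
    """
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                # Light up the neighbourhood, clipped to the image border.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


# Hypothetical 5 x 5 edge map with a single edge pixel in the centre.
edges = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dilated = dilate(edges)
for row in dilated:
    print(row)
```

The single lit pixel grows into a 3 x 3 block, which is why thin edges lying close to a larger
boundary merge into it after dilation.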
The algorithm yields accurate true positive results only for potholes between approximately 2 m and
20 m ahead of the vehicle. It is successful in detecting potholes and does not rely on training any
models.
Katageri M et al. [10] introduced an automated solution for the effective management of road
depressions within a ward, geotagging images obtained from the Brihanmumbai Municipal Corporation
(BMC) through its website voiceofcitizen.com. The images were geotagged using their latitude and
longitude information on Google Earth, a freely available application. Each input image is converted
to a binary image in MATLAB. Edge detection algorithms such as Canny, Sobel, Prewitt, Roberts and
Zerocross were then applied to these binary images. The Canny edge detection algorithm generally
provided the best detection of objects in an image; in certain images, however, it could not
efficiently determine the pothole edges. For these images, the Zerocross edge detection algorithm was
employed, relying on the intensity values at the pixel edges.
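As a rough illustration of gradient-based edge detection, the sketch below applies the standard 3 x 3
Sobel kernels to a small grayscale image. It is not the MATLAB code used in [10]; the function name
sobel_magnitude and the toy image are hypothetical.

```python
def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels; border pixels are left at 0."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal intensity change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical intensity change
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    v = img[r + i - 1][c + j - 1]
                    gx += kx[i][j] * v
                    gy += ky[i][j] * v
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# Hypothetical image with a vertical step edge: dark on the left, bright on the right.
image = [[0, 0, 10, 10] for _ in range(4)]
magnitude = sobel_magnitude(image)
print(magnitude[1])
```

Pixels next to the step receive a large magnitude while flat regions stay at zero; thresholding
this map is the simplest way to obtain a binary edge image.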
Once the edges are detected, the images are inverted to show the main target in white. Statistical
analysis carried out in Google Sheets shows that potholes with an area in the range of 8 to 13 sq. m.
(medium range) were the most numerous. With this system, clicking on a tagged place displays the
pothole images along with their exact location and a description, such as the area and depth used to
estimate the quantity of filling material required. The tagged information is stored in a KML file
that can easily be shared by email.
Yusmeeraz Yusof, Suhaila Isaak et al [11] proposed an automated pothole detection method using the
YOLO algorithm with Darknet-53 as the backbone of the model. A dataset of 330 images was captured
with a 3 MP camera fixed to the car plate at a 25 degree downward angle and divided into a test set of
66 images and a training set of 264 images. The training set is annotated in YOLO format using the
LabelImg tool. The model is trained for 2000 epochs. Additionally, the bokeh and pandas libraries are
used to translate the logged location data (latitudes and longitudes) of detected potholes for
visualisation with the Google Maps API. The pothole detection model developed using the YOLOv3
algorithm achieved an accuracy of 65.05%. This setup can therefore be used to detect potholes and
reduce pothole-related accidents.
Sritrusta Sukaridhoto et al [12] proposed
a pothole detection system using an IMU sensor attached to an embedded system. Road defects are
recorded using a gyro sensor, an accelerometer and GPS attached to the head unit of a vehicle moving
at 20 km/h to 50 km/h; the recorded data is sent to a NoSQL cloud database, where it is used to train
models with scikit-learn to perform predictions. Gyroscope data is used to visualise a pothole map
using the Highcharts library, and the maps are built using the Leaflet library. Support Vector Machine
and Decision Tree algorithms are used to classify the data. The analysis shows that the Support Vector
Machine has the higher accuracy, 98%, with an error of eight bumps. The gyro-sensor methodology is
useful for detecting potholes that are not visible, as in flooded regions and puddles.
M.R. Rani et al [13] proposed a new system for identifying potholes and road bumps to complement
ADAS, which detects only vehicles, pedestrians, road lanes and signage. The pothole and road bump
detection model is built using SSD (Single Shot Detector). SSD detects objects in a single forward
pass, enabling it to detect objects at greater speed. The model is trained using 500 images each of
potholes and road bumps captured along Malaysian roads with a camera mounted on the car windshield.
To speed up training, a transfer learning approach is adopted in which weights pre-trained on the
COCO dataset are further tuned to reduce the localisation loss, which measures the error between the
ground truth and predicted bounding boxes. In the testing phase, a new set of data is used as input,
and the developed model detects potholes and road bumps with an accuracy of 60% and 70%
respectively. Not all potholes are spotted in the test photographs because the model fails to
recognise large and distant potholes. This may be due to
the dataset not containing enough data for that situation.
Muhammad Haroon Asad et al [14] explored the potential of deep learning models for deployment in
pothole detection. An AI kit named OAK-D is set up on a Raspberry Pi for pothole detection, and a
comparison study is made of object detectors including YOLOv1, YOLOv2, YOLOv3, YOLOv4, Tiny-YOLOv4,
YOLOv5 and SSD-MobileNetV2. The detections are performed on the OAK-D with the Raspberry Pi as the
host computer. A dataset of 665 images representing real-world scenarios such as shadows and
vehicles, containing 8000 potholes in total, is collected from an online database. The dataset is
split in an 80:20 ratio for training and testing respectively. Frameworks such as Darknet, PyTorch
and TensorFlow provide the required tools and libraries for the machine learning process. All the
models are trained by transfer learning, using the weights of a pretrained model, for 20,000 epochs.
The YOLOv5 model showed the highest accuracy, 95%, in detecting potholes. In contrast,
SSD-MobileNetV2 detected no potholes when there were several potholes in the scene, a situation in
which the YOLOv4 model operated flawlessly. For real-time detection in vehicles, the OAK-D is mounted
on the car dashboard to capture the maximum road area, with the Raspberry Pi acting as host computer,
and tested at three distance ranges (long range = 10 m, mid range = 5 m, close range = 2 m). The work
reported 90% accuracy for real-time pothole detection and can therefore be applied for rapid and
optimised road maintenance actions.
K Gajjar et al [15] compared different deep learning object
detection models, namely Faster R-CNN, SSD and YOLOv3, to find which provides the best results on the
same dataset. A total of 1910 images, captured with a GoPro Hero 3+ in St. Francis Bay and Jeffrey's
Bay in the Eastern Cape Province of South Africa and supplemented from an online database, are used
to build the image data for all three models. The dataset is divided into a positive set containing
images of potholes and a negative set containing images without potholes, and later split in a ratio
of 70:20 for training and testing. The YOLO model is trained using the weights of a model pretrained
on the ImageNet database. The final YOLO model is set up on the dashboard of a car travelling at
60 km/h to test its detection accuracy. The results revealed that Faster R-CNN was more accurate than
SSD; although SSD was fast, its detection accuracy was unsatisfactory. The YOLOv3 network performs
better for real-time applications because its detection time per object is lower than that of Faster
R-CNN and SSD. Once the YOLOv3 model was shown to have the higher accuracy, the GPS coordinates are
saved and sent to a cloud server. This research helps to add a warning for the driver on the head-up
display of the vehicle.
Lu Wang [16] studied the image restoration
technology for cases where the acquired images are degraded by haze, blur, noise and similar effects,
since current motion-based and CNN-based image restoration technologies have difficulty dealing with
such external factors. There are currently two types of image restoration technology: non-stationary
image processing and non-linear processing. The former is represented by the Kalman filter, and the
latter is a typical application of neural networks. One way to model the degradation is to collect
sub-images that are less affected by noise and construct an estimated image. A second is to build a
degradation model that takes environmental factors such as raindrops and haze into account. As it is
difficult to detect target objects in these situations, a 3x3 kernel is replaced with multi-scale
features at a more fine-grained level to enhance the field of view. This application of deep learning
image restoration technology to road pothole detection improves road maintenance.
Susmita Patra, Asif Iqbal Middya and Sarbani Roy [17] provided a complete solution for detecting
potholes and mapping them spatially through a smartphone application. This allows people to know the
locations of potholes and helps avoid potential road accidents. An IoT-based app called POTSPOT is
built for pothole monitoring and spatial mapping across the city. A custom CNN model is proposed for
real-time pothole detection.
The collected images are categorised into two classes: pothole and non-pothole. A total of 3424
images are collected by smartphone, of which 3264 are randomly selected to train the model and
divided into training and validation sets. The remaining 160 images are used to test the model. The
performance of the proposed CNN model is thus evaluated on a real-world dataset of 3424 images. To
validate its efficiency, the proposed model is compared with SVM, ANN, KNN, InceptionV3, VGG16 and
VGG19. The model is built on a CNN architecture to perform the
predictions, and its performance is measured on the test dataset of 160 images, 80 for each class. The
results show that the model predicts potholes with an accuracy of 97%, compared with 92.5% for
InceptionV3, 91.25% for VGG19, 88.75% for VGG16, 79.71% for SVM, 78.12% for ANN and 63.77% for KNN.
Additionally, the POTSPOT mobile application shows the locations of the detected potholes reported
through the app. The application is developed in the Android Studio IDE using Java and XML
(Extensible Markup Language): Android activities are written in Java, whereas the UI (User
Interface) templates are designed in XML. The performance of the system is studied through a case
study in the city of Kolkata, India. POTSPOT thus supports monitoring and mapping of potholes,
enabling policy makers to identify and repair them, and its CNN model outperformed SVM, ANN, KNN,
InceptionV3, VGG16 and VGG19.
A Ab Rahman et al [18] used a DJI Phantom 3 Professional UAV equipped with a LiDAR sensor
as an alternative to the surveying process for obtaining aerial images. The LiDAR sensor is used for
volumetric calculations as an alternative to the tacheometry method. The UAV flight altitude is set at
20 m above the ground. Photogrammetric software is used to calculate the volume of objects, as it can
produce three-dimensional models from aerial photographs. To obtain good results, GCPs (Ground
Control Points) are recorded in the WGS84 projection, and the volume of the objects is calculated
using Agisoft software. The methodology adopted in that paper therefore helps in obtaining the volume
of objects by UAV, which is more economical and less time-consuming than conventional surveying
techniques.
Imam Sutrisno [19] proposed a pothole detection tool using GLCM (Grey Level Co-occurrence Matrix)
and NN (Neural Network), where GLCM is used for feature extraction from the images and the NN is
trained on the extracted features. The images are collected with a digital camera and converted to
grayscale so that GLCM can capture the texture of the image, using the degree of contrast, granularity
and roughness and the relation between pixels in the image through histograms. The NN trained on GLCM
features is tested on 10 test images and detects potholes with an accuracy of 86%. This study
therefore shows that GLCM can be used as an alternative method for feature extraction from an image.
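To make the GLCM idea concrete, the following minimal sketch builds a normalised co-occurrence
matrix for one pixel offset and derives the contrast feature from it. The horizontal-neighbour
offset, the two grey levels and the function names are assumptions chosen for illustration; [19]
does not specify these details here.

```python
def glcm(img, levels, dr=0, dc=1):
    """Normalised grey-level co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often grey level i is followed by grey level j.
                counts[img[r][c]][img[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i][j] * (i - j)^2."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 3 x 3 image quantised to two grey levels (0 and 1).
image = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
p = glcm(image, levels=2)
texture_contrast = contrast(p)
print(p, texture_contrast)
```

A smooth surface concentrates the matrix on its diagonal and gives a contrast near zero, while a
rough texture spreads mass off the diagonal; features of this kind would form the input vector of
the neural network.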
Hasan Hamodi Joni et al [20] provided an overview of the classification of pavement distresses such
as patches, potholes and cracks by image analysis and processing techniques using MATLAB code. The
first code processes video data and the second processes image data. The data is captured with the
camera lens pointing vertically downwards and aligned parallel to the traffic flow. A Gaussian filter
is applied to the video to remove noise. The image code was trained on more than 360 images of
different pavement failures and tested on 40 images, resulting in 77.5% accuracy. The developed
Automatic Evaluation of Pavement (AEOP) model is compared with traditional (manual, visual) surveying
methods on 10 selected sections of a road; the survey took 25 minutes with AEOP, whereas traditional
visual surveying took 310 minutes. The paper presented the usefulness and safety benefits of
employing image processing to measure pavement surface defects.
R Sulistyowati, A Suryowinoto, H A Sujono and I Iswahyudi [21] identified
potholes on the roadway, reported the damage, and established a road contour damage information
system on Google Maps so that road users can take care when passing along the road. The method is
developed using Werner D. Streidt algorithm threshold values and edge detection in digital image
processing to detect potholes and locate them by coordinates on Google Maps. To build the system, a
Raspberry Pi 2 is used with two attached devices: a u-blox NEO-6M GPS sensor and a CSI camera
interface for data capture. All the data is captured in the daytime in sunny weather. The images are
divided based on their intensity and colour. Detection relies on changes in colour contrast between
the area around the pothole and the rest of the image. The detected pothole image is converted into a
binary image using the Werner D. Streidt algorithm, and the edges are then detected using the Canny
method. Once the model detects a pothole, the GPS sensor saves its location data. A Google Maps API
script in GMLib creates pinpoint geotagged locations. The final model is tested in a vehicle moving at
40 km/h and detects potholes with a success rate of 67%. This model therefore helps alert commuters to
the location of potholes on Google Maps so that pothole-related accidents can be avoided.
S D Batanov, O S Starostina and I A Baranova [22] proposed non-contact
remote measurement using digital technologies. The authors applied the method to a sample
of 75 cows to measure chest depth and width, rump length and width,
and body length from photographs. Points are measured on the photos
with the help of a perspectometer of known dimensions placed in the frame; in this case, a measuring
stick serves as the perspectometer. The images were captured with a digital camera fixed on a tripod
in three views: front, side and back. The length and breadth are thus obtained in terms of
pixels on the image. To find the third dimension of the object, i.e. depth, a Structure Sensor
3D depth camera comprising infrared lasers, sensors and a special backlight is used. The
infrared laser projects a dotted pattern on objects from a distance of 3.5 metres and the sensor
records the distortions of the pattern, from which a depth map is created. Together, the
perspectometer and the Structure Sensor 3D provide all three dimensions (length, breadth and depth)
of an object. The method yielded dimensions with an error rate of 2%, making non-contact linear
measurement with digital technologies feasible and less time-consuming.
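The perspectometer idea reduces to a simple proportion: a reference object of known length fixes the scale of the photo, and pixel distances are multiplied by that scale. The Python sketch below illustrates only this proportion. The function name and all numbers are hypothetical and are not taken from [22], and the sketch assumes that the reference stick and the measured feature lie in the same plane at the same distance from the camera.

```python
def pixels_to_length(feature_px, reference_px, reference_length_cm):
    """Convert a pixel distance to centimetres using a reference of known length.

    Assumes the reference object and the feature are in the same plane,
    so a single scale (cm per pixel) applies to both.
    """
    if reference_px <= 0:
        raise ValueError("reference_px must be positive")
    cm_per_pixel = reference_length_cm / reference_px
    return feature_px * cm_per_pixel


# Hypothetical values: a 100 cm measuring stick spans 400 px in the photo,
# and the feature of interest spans 620 px.
body_length_cm = pixels_to_length(620, 400, 100.0)
print(body_length_cm)
```

A single scale factor of this kind only holds near the plane of the reference object, which is presumably why a separate depth sensor is needed for the third dimension.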
In summary, the literature provided insights into different data capturing methods for potholes
(such as fisheye cameras, multispectral cameras and UAVs) and into sensor-based data
capturing techniques using gyroscopes and accelerometers. It also established the feasibility of
neural networks and deep learning for automating road maintenance programs, the advantages of
applying CNNs (Convolutional Neural Networks) with different algorithms (such as RCNN, Faster RCNN,
Sparse RCNN, SSD and different versions of YOLO), and the training process and transfer learning
used to develop models suited to a given use. Performance comparisons of these algorithms
can help in choosing the one best suited to the intended task. Additionally, a few
alternatives to deep learning that do not require a training process to build a detection
model were explored, such as contour detection methods, programming a
Raspberry Pi device, GLCM-based feature extraction and the Werner D. Streidt algorithm, all of which
can also help in detecting pavement distresses. Finally, a few dimension estimation methods were
explained, in which the dimensions of an object are obtained from an image by applying Inverse
Perspective Mapping, Canny edge and contour detection filters, LiDAR, triangle
similarity and perspectometers.
2. Methodology
are now in the form of pixels and are converted into physical measurements using a scaling ratio,
the Spatial resolution factor, which is obtained by dividing the calibrated image resolution by the
detected pothole image resolution. The bounding box dimensions in the detected pothole image are
multiplied by the Spatial resolution factor to obtain the scaled dimensions in pixels, which are
then divided by the fixed value of 96 PPI to give the measurements of the potholes in inches.
In our example, the fixed focal length provided an image of 54 inches, giving a
pixels-per-inch value of 54/96.
Using the Spatial resolution factor, the dimensions of the potholes can be obtained without
compromising on image resolution, provided the focal length and the PPI of the image are fixed. The
system methodology is shown in Figure 3.
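The pixel-to-inch conversion described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, "resolution" is assumed to mean the image width in pixels, the bounding box is assumed to be given as (x_min, y_min, x_max, y_max), and the example numbers are invented for illustration only.

```python
PPI = 96  # fixed pixels-per-inch value stated in the text


def spatial_resolution_factor(calibrated_width_px, detected_width_px):
    """Scaling ratio: calibrated image resolution / detected image resolution.

    Assumption: resolution is represented here by the image width in pixels.
    """
    if detected_width_px <= 0:
        raise ValueError("detected_width_px must be positive")
    return calibrated_width_px / detected_width_px


def bbox_size_inches(bbox, factor, ppi=PPI):
    """Scale a bounding box (x_min, y_min, x_max, y_max) and convert it to inches."""
    x_min, y_min, x_max, y_max = bbox
    width_in = (x_max - x_min) * factor / ppi
    height_in = (y_max - y_min) * factor / ppi
    return width_in, height_in


# Hypothetical example: calibration image 1920 px wide, detection image 640 px wide.
factor = spatial_resolution_factor(1920, 640)
width_in, height_in = bbox_size_inches((100, 200, 260, 296), factor)
print(factor, width_in, height_in)
```

Because the factor is a plain ratio of resolutions, the detection image can be resized for inference without changing the estimated size, as long as the camera height, focal length and PPI stay fixed.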
Once the pothole detection model provides its prediction results, the coordinates of the bounding
boxes are extracted by the dimension detection model. Figure 7 shows the output of the dimension
detection model with the ground dimensions mentioned below. The estimated dimensions are validated
against the ground dimensions to assess their accuracy. The mean error rate of the model is 12
percent. A comparison of the actual ground measurements and the measurements detected by the model
is shown in Table 1.
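One common way to compute a mean error rate of this kind is the mean absolute percentage error between ground and detected measurements. The sketch below assumes that definition, which the text does not state explicitly; the function name is hypothetical and the values are illustrative only, not the entries of Table 1.

```python
def mean_error_rate(ground, detected):
    """Mean absolute percentage error between ground and detected measurements.

    Assumption: each error is |detected - ground| / ground, averaged over all
    measurements and expressed in percent.
    """
    if len(ground) != len(detected) or not ground:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(d - g) / g for g, d in zip(ground, detected)]
    return 100.0 * sum(errors) / len(errors)


# Illustrative values in inches (hypothetical, not taken from Table 1).
ground_in = [20.0, 10.0, 40.0]
detected_in = [22.0, 9.0, 44.0]
rate = mean_error_rate(ground_in, detected_in)
print(round(rate, 2))
```

Averaging relative rather than absolute errors keeps small and large potholes on the same footing when the error is reported as a single percentage.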
The system proposed in this paper helps automate the process, thereby reducing the time and labour
required for the maintenance of roads and helping to plan an effective and efficient road
maintenance program. The dimension estimator helps in assessing the severity of potholes by their
area, which in turn helps engineers prioritise maintenance works.
References
[1] Road accidents in India 2021 (New Delhi: Ministry of Road Transport & Highways)
[2] Roboflow, https://universe.roboflow.com/
[3] Christian K A, Ioannis B 2011 Pothole Detection in Asphalt Pavement Images Advanced
Engineering Informatics 25 507-15 DOI:10.1016/j.aei.2011.01.002
[4] Qurishee M A 2019 Low-cost deep learning UAV and Raspberry Pi solution to real time pavement
condition assessment (Chattanooga: University of Tennessee at Chattanooga)
[5] Chen H, Yao M, Gu Q 2020 Pothole Detection Using Location-Aware Convolutional Neural
Networks International Journal of Machine Learning and Cybernetics 11 899–911 DOI:
https://doi.org/10.1007/s13042-020-01078-7
[6] Tao Ma, Junqing Z, Jingtao Z, Xiaoming H, Weiguang Z, Yang Z 2022 Pavement Distress
Detection Using Convolutional Neural Networks with Images Captured Via UAV Automation in
Construction 133 DOI: https://doi.org/10.1016/j.autcon.2021.103991
[7] Chitale Pranjal, Kekre Kaustubh, Shenai Hrishikesh, Karani Ruhina, Gala Jay 2020 Pothole
Detection and Dimension Estimation System Using Deep Learning (YOLO) and Image
Processing 35th Int. Conf. on Image and Vision Computing New Zealand (New Zealand: IEEE)
pp: 1-6 DOI:10.1109/IVCNZ51579.2020.9290547
[8] Bučko B, Lieskovská E, Zábovská K, Zábovský M 2022 Computer Vision Based Pothole Detection
under Challenging Conditions Sensors for Smart Vehicles Applications 22 DOI:
https://doi.org/10.3390/s22228878
[9] Nienaber S, Booysen M J, Kroon R S 2015 Detecting Potholes Using Simple Image Processing
Techniques and Real-World Footage 34th Annual Southern African Transport Conference
(Pretoria: University of Pretoria) DOI:10.13140/RG.2.1.3121.8408
[10] Katageri M, Mandal M, Gandhi M, Koregaonkar N, Sengupta S 2016 Automated Management of
Pothole Related Disasters Using Image Processing and Geotagging International Journal of
Computer Science & Information Technology (IJCSIT) 7 DOI:
https://doi.org/10.5121/ijcsit.2015.7608
[11] Yik Y K, Alias N E, Yusof Y, Isaak S 2021 A real-time pothole detection based on deep learning
approach Journal of Physics: Conference Series 1828 IOP Publishing
[12] Ulil A M R, Sukaridhoto S, Tjahjono A, Basuki D K 2019 The vehicle as a mobile sensor network
base IoT and big data for pothole detection caused by flood disaster IOP Conference Series: Earth
and Environmental Science 239 12-34 IOP Publishing
[13] Rani M R, Mustafar M Z C, Ismail N H F, Mansor M S F, Zainuddin Z 2021 Road peculiarities
detection using deep learning for vehicle vision system IOP Conference Series: Materials Science
and Engineering 1068 IOP Publishing
[14] Muhammad Haroon Asad, Saran Khaliq, Muhammad Haroon Yousaf, Muhammad Obaid Ullah,
Afaq Ahmad 2022 Pothole Detection Using Deep Learning: A Real-Time and AI-on-the-Edge
Perspective Advances in Civil Engineering 2022 https://doi.org/10.1155/2022/9221211
[15] Gajjar K, van Niekerk T, Wilm T, Mercorelli P 2022 Vision-based deep learning algorithm for
detecting potholes Journal of Physics: Conference Series 2162 12-19 IOP Publishing DOI:
10.1088/1742-6596/2162/1/012019
[16] Wang L 2021 Research on Road Pothole Detection Method Based on Computer Image Restoration
Technology Journal of Physics: Conference Series 1992 IOP Publishing DOI:
10.1088/1742-6596/1992/3/032028
[17] Patra S, Middya A I, Roy S 2021 PotSpot: Participatory sensing-based monitoring system for
pothole detection using deep learning Multimedia Tools and Applications 80 25171-25195 DOI:
https://doi.org/10.1007/s11042-021-10874-4
[18] Ab Rahman A A, Maulud K A, Mohd F A, Jaafar O, Tahar K N 2017 Volumetric calculation using
low cost unmanned aerial vehicle (UAV) approach IOP Conference Series: Materials Science and
Engineering 270 12-32 IOP Publishing DOI:10.1088/1757-899X/270/1/012032
[19] Sutrisno I, Syauqi A W, Hasin M K, Rahmat M B, Asmara I P S, Wiratno D, Setiawan E 2020
Design of pothole detector using gray level co-occurrence matrix (GLCM) and neural network
(NN) IOP Conference Series: Materials Science and Engineering 874 12-12 IOP Publishing DOI:
10.1088/1757-899X/874/1/012012
[20] Joni H H, Alwan I A, Naji G A 2020 Investigations of the road pavement surface conditions using
MATLAB image processing IOP Conference Series: Materials Science and Engineering 737 IOP
Publishing DOI: 10.1088/1757-899X/737/1/012133
[21] Sulistyowati R, Suryowinoto A, Sujono H A, Iswahyudi I 2021 Monitoring of road damage
detection systems using image processing methods and Google Map IOP Conference Series:
Materials Science and Engineering 1010 12-17 IOP Publishing DOI:
10.1088/1757-899X/1010/1/012017
[22] Batanov S D, Starostina O S, Baranova I A 2019 Non-contact methods of cattle conformation
assessment using mobile measuring systems IOP Conference Series: Earth and Environmental
Science 315 32-36 IOP Publishing DOI: 10.1088/1755-1315/315/3/032006
[23] Yifan P, Xianfeng Z, Guido C, Liping Y 2018 Detection of Asphalt Pavement Potholes and Cracks
Based on the Unmanned Aerial Vehicle Multispectral Imagery IEEE Journal of Selected Topics
in Applied Earth Observations and Remote Sensing 11 1-12 DOI:
https://doi.org/10.1109/JSTARS.2018.2865528
[24] Kharel S, Ahmed K R 2022 Potholes Detection Using Deep Learning and Area Estimation Using
Image Processing Intelligent Systems and Applications 296 373–88 DOI:
https://doi.org/10.1007/978-3-030-82199-9_24.
[25] Label-Img, https://github.com/heartexlabs/labelImg
[26] Ultralytics, YOLO algorithm, https://docs.ultralytics.com/
[27] Introduction to YOLO v8, https://github.com/ultralytics/ultralytics