sciences
Article
A Pavement Crack Detection and Evaluation Framework for a
UAV Inspection System Based on Deep Learning
Xinbao Chen * , Chang Liu *, Long Chen, Xiaodong Zhu, Yaohui Zhang and Chenxi Wang
School of Earth Sciences and Spatial Information Engineering, Hunan University of Sciences and Technology,
Xiangtan 411201, China; 1901080111@mail.hnust.edu.cn (L.C.); xiaodong@mail.hnust.edu.cn (X.Z.);
eminem@mail.hnust.edu.cn (Y.Z.); 2101080215@mail.hnust.edu.cn (C.W.)
* Correspondence: xchen@hnust.edu.cn (X.C.); liuchang@mail.hnust.edu.cn (C.L.); Tel.: +86-18670925719 (X.C.);
+86-18845160205 (C.L.)
Abstract: Existing studies often lack a systematic solution for unmanned aerial vehicle (UAV)
inspection systems, which hinders their widespread application in crack detection. To improve
practicality, this study proposes a formal and systematic framework for UAV inspection
systems, specifically designed for automatic crack detection and pavement distress evaluation. The
framework integrates UAV data acquisition, deep-learning-based crack identification, and road
damage assessment in a comprehensive and orderly manner. Firstly, a flight control strategy is
presented, and road crack data are collected using DJI Mini 2 UAV imagery, establishing high-quality
UAV crack image datasets with ground truth information. Secondly, a validation and comparison
study is conducted to enhance the automatic crack detection capability and provide an appropriate
deployment scheme for UAV inspection systems. This study develops automatic crack detection
models based on mainstream deep learning algorithms (namely, Faster-RCNN, YOLOv5s, YOLOv7-
tiny, and YOLOv8s) in urban road scenarios. The results demonstrate that the Faster-RCNN algorithm
achieves the highest accuracy and is suitable for the online data collection of UAV and offline
inspection at work stations. Meanwhile, the YOLO models, while slightly lower in accuracy, are the
fastest algorithms and are suitable for the lightweight deployment of UAV with online collection and
real-time inspection. Quantitative measurement methods for road cracks are presented to assess road
damage, which will enhance the application of UAV inspection systems and provide factual evidence
for the maintenance decisions made by road authorities.
Keywords: road cracks; UAV; deep learning; target detection; road damage evaluation; framework
Citation: Chen, X.; Liu, C.; Chen, L.; Zhu, X.; Zhang, Y.; Wang, C. A Pavement Crack Detection and
Evaluation Framework for a UAV Inspection System Based on Deep Learning. Appl. Sci. 2024, 14, 1157.
https://doi.org/10.3390/app14031157
Academic Editor: Luís Picado Santos
Received: 13 December 2023
Revised: 20 January 2024
Accepted: 27 January 2024
Published: 30 January 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Roads are one of the crucial transportation infrastructures that deteriorate over time,
due to factors such as heavy vehicles, changing weather conditions, human activity, and
the use of inferior materials. This deterioration impacts economic development, travel
safety, and social activities [1]. Therefore, it is crucial to periodically assess the condition of
roads to ensure their longevity and safety. Additionally, it is imperative to accurately and
promptly identify road damage, especially cracks, in order to prevent further deterioration
and enable timely repairs.
Currently, pavement condition inspection technologies mainly include traditional
manual measurements and automatic distress inspections, such as vehicle-mounted
inspection [2]. Manual inspection methods rely primarily on visual discrimination, requiring
personnel to travel along roads to identify damage points. However, this approach is slow,
laborious, subjective, and less accurate [3]. Therefore, the development of automatic
inspection technologies is crucial for quickly and accurately detecting
and identifying cracks on the road. In recent years, intelligent crack inspection systems
have gained increasing attention and application, such as vehicle-mounted inspections and
their intelligent systems [4]. Guo et al. [5] utilized core components such as vehicle-mounted
high-definition image sensors, laser sensors, and infrared sensors. These components
enable the acquisition of high-precision road crack data in real time. However, the overall
configuration of the vehicle-mounted system is expensive and limited in scope, making it
challenging to widely apply [2].
Notably, automatic pavement distress inspection has traditionally utilized image-
processing techniques such as Gabor filtering [6], edge detection, intensity thresholding [7],
and texture analysis. Cracks are identified by analyzing the changes in edge gradients
and intensity differences compared to the background, and then extracting them through
threshold segmentation [2]. However, these methods are highly influenced by environmen-
tal factors, including lighting conditions, which can affect their accuracy. Moreover, these
methods are not effective when the camera configurations vary, making their widespread
use impractical [1,8]. Given the limitations of these traditional approaches, it is crucial to
develop a cost-effective, accurate, fast, and independent method for the accurate detection
of road cracks.
In recent years, there have been significant advancements in machine learning and
deep learning algorithms, leading to the emergence of automatic deep learning methods
as accurate alternatives to traditional object recognition methods. These methods have
shown immense potential in visual applications and image analysis, particularly in road
distress inspection [1,8]. Krizhevsky et al. [9] proposed a deep convolutional neural net-
work (CNN) architecture for image classification, especially in the detection of distresses
in asphalt pavements. Cao et al. [3] presented an attention-based crack network (AC-
Net) for automatic pavement crack detection. Extensive experiments on the CRACK500
demonstrated that ACNet achieved a higher detection accuracy compared to eight other
methods. Tran et al. [10] utilized a supervised machine learning network called RetinaNet
to detect and classify various types of cracks that had developed in asphalt pavements,
including lane markers. The validation results showed that the trained network model
achieved an overall detection and classification accuracy of 84.9%, considering both the
crack type and severity level. Xiao et al. [11] proposed an improved model called C-Mask
RCNN, which enhances the quality of crack region proposal generation through cascad-
ing multi-threshold detectors. The experimental results indicated that the mean average
precision of the C-Mask RCNN model’s detection component was 95.4%, surpassing the
conventional model by 9.7%. Xu K et al. [12] also proposed a crack detection method based
on an improved Faster-RCNN for small cracks in asphalt pavements, even under complex
backgrounds. The experiments demonstrated that the improved Faster-RCNN model
achieved a detection accuracy of 85.64%. Xu X et al. [13] conducted experiments to evaluate
the effectiveness of Faster R-CNN and Mask R-CNN and compared their performances in
different scenarios. The results showed that Faster R-CNN exhibited a superior crack detec-
tion accuracy compared to Mask R-CNN, while both models demonstrated efficiency in
completing the detection task with small training datasets. The study focuses on comparing
Faster R-CNN and Mask R-CNN, but does not compare the proposed methods with other
existing crack detection methods. In general, these above-mentioned methods not only
detect the category of an object, but also determine the object’s location in the image [14].
The use of deep learning methods can reduce labor costs and improve work efficiency and
intelligence in recognizing road cracks [1].
Meanwhile, unmanned aerial vehicles (UAV) have demonstrated their versatility in a
wide range of applications, including urban road inspections. This is attributed to their
exceptional maneuverability, extensive coverage, and cost effectiveness [2]. Equipped with
high-resolution cameras and various sensors, these vehicles can capture images of the
road surface from multiple angles and heights, providing a comprehensive assessment
of its condition. Several researchers have utilized UAV imagery to study deep learning
methods for road crack object detection, and they have achieved impressive accuracy
results. Yokoyama et al. [15] proposed an automatic crack detection technique using arti-
ficial neural networks. The study focused on classifying cracks and non-cracks, and the
algorithm achieved a success rate of 79.9%. Zhu et al. [2] utilized images collected by a
UAV to conduct experimental comparisons of three deep learning target detection methods
(Faster R-CNN, YOLOv3, and YOLOv4) via convolutional neural networks (CNN). The
study verified that the YOLOv3 algorithm is optimal, with an accuracy of 56.6% mAP. In
another study, Jiang et al. [16] proposed an RDD-YOLOv5 algorithm with Self-Attention
for UAV road crack detection, which significantly improved the accuracy with an mAP of
91.48%. Furthermore, Zhang et al. [17] proposed an improved YOLOv3 algorithm for road
damage detection from UAV imagery, incorporating a multi-layer attention mechanism.
This enhancement resulted in an improved detection accuracy with an mAP of 68.75%.
Samadzadegan et al. [1] utilized the YOLOv4 deep learning network and evaluated its
performance using various metrics such as F1-score, precision, recall, mAP, and IoU. The
results showed that the proposed model had an acceptable performance in road crack
recognition. Additionally, Zhou et al. [18] introduced a UAV visual inspection method
based on deep learning and image segmentation for detecting cracks on crane surfaces.
Moreover, Xiang et al. [19] presented a lightweight UAV road crack detection algorithm
called GC-YOLOv5s, which achieved an accuracy validation mAP of 74.3%, outperforming
the original YOLOv5 by 8.2%. Wang et al. [20] introduced BL-YOLOv8, an improved road
defect detection model that enhances the accuracy of detecting road defects compared to
the original YOLOv8 model. BL-YOLOv8 surpasses other mainstream object detection
models, such as Faster R-CNN, SSD, YOLOv3-tiny, YOLOv5s, YOLOv6s, and YOLOv7-tiny,
by achieving detection accuracy improvements of 17.5%, 18%, 14.6%, 5.5%, 5.2%,
2.4%, and 3.3%, respectively. Furthermore, Omoebamije et al. [21] proposed an improved
CNN method based on UAV imagery, demonstrating a remarkable accuracy of 99.04% on
a customized test dataset. Lastly, Zhao et al. [22] proposed a highway crack detection
and CrackNet classification method using UAV remote sensing images, achieving 85% and
78% accuracy for transverse and longitudinal crack detection, respectively. These aforemen-
tioned studies primarily aim to enhance the deep learning algorithm using UAV images.
This enhancement improves the accuracy of road crack detection and also establishes
the methodological foundation for the crack target recognition algorithm discussed in
this paper.
However, most of the above-mentioned studies primarily focused on UAV detection
algorithms and neglected UAV data acquisition and high-quality imagery integrated into
detection methods. For instance, the flight settings required for capturing high-quality
images have not been thoroughly studied [2]. Flying too high or too fast may result in
poor-quality images [22]. Zhu et al. [2] and Jiang et al. [16] both introduced flight setup and
experimental tricks for efficient UAV inspection. Liu K.C. et al. [23] proposed a systematic
solution for automatic crack detection for UAV inspection. These studies are still incomplete
due to a lack of detailed data acquisition and pavement distress assessment. Additionally,
there is a lack of quantitative measurement methods for cracks, which hampers accurate
data support for road distress evaluation. Furthermore, inconsistency in flight altitude and
the absence of ground real-scale information of cracks adversely impact the subsequent
quantitative assessment of cracks.
Obviously, existing studies frequently lack a systematic solution or integrated frame-
work for UAV inspection technology, which hinders its widespread application in pavement
distress detection. Therefore, this study aims to propose a formal and systematic framework
for automatic crack detection and pavement distress evaluation in UAV inspection systems,
with the goal of making them widely applicable.
Our proposed framework for a UAV inspection system for automatic road crack
detection offers several advantages: (1) It demonstrates a more systematic solution. The
framework integrates data acquisition, crack identification, and road damage assessment
in orderly and closely linked steps, making it a comprehensive system. (2) It exhibits
a greater robustness. By adhering to the flight control strategy and model deployment
scheme, the drone ensures high-quality data collection while employing state-of-the-art
automatic detection algorithms based on deep learning models that guarantee accurate
crack identification. (3) It presents an enhanced practicality. The system utilizes the
cost-effective DJI (DJ-Innovations Company, headquartered in Shenzhen, China) Mini 2 drone for
imagery acquisition and DL-based model deployment, making it an economically viable
solution with significant potential for widespread implementation.
The rest of this paper is organized as follows: Section 2 presents the framework for the
UAV inspection system designed specifically for pavement distress analysis. In Section 3,
we provide a comprehensive overview of four prominent deep-learning-based crack de-
tection algorithms, namely Faster-RCNN, YOLOv5s, YOLOv7-tiny, and YOLOv8s, along
with their distinctive characteristics. Section 4 elaborates on the well-defined procedures
employed for UAV data acquisition and subsequent data preprocessing. The experimental
setup and comparative results are presented in Section 5. In Section 6, we propose quan-
titative methods to evaluate road cracks and assess pavement distress levels. Finally, in
Section 7, we summarize our research while discussing future work.
Figure 1. A framework of UAV inspection system based on deep learning for pavement distress.
3. Deep Learning Algorithms
In recent years, there has been significant progress in deep learning technology, lead-
ing to a paradigm shift in target detection methods from traditional algorithms based on
manual features to deep-neural-network-based detection methods [24]. These deep learn-
ing algorithms can be categorized into two major approaches (Figure 2): (1) Two-stage
methods (two-stage algorithms), which involve labeling multiple target candidate regions
in the image and subsequently classifying and regressing the boundary of each candidate
Figure 3. An illustration of Faster-RCNN (modified from [29]).
3.2. YOLO Series Algorithms
The YOLO series algorithm is a typical representative example of the one-stage algorithm
target detection model. In comparison to the Faster RCNN algorithm, YOLO eliminates
the need to extract candidate regions that may contain targets. It completes the detection
task using only one network and predicts the category and location of the target
object in the detection output through regression. Currently, YOLOv5 is the initial model
of the series, which has been proven to be stable and is widely used in lightweight road
crack detection methods due to its excellent accuracy [17,21]. YOLOv5 consists of several
networks with different depths, namely n, s, m, l, and x. The depth and width of the
network increase in the order of n, s, m, l, and x. Among these options, YOLOv5s is suitable
for small deep networks or small-scale datasets.
The network architecture of YOLOv5 is depicted in Figure 4. The model comprises
three main components: the backbone network (BackBone), the neck network (Neck), and
the head detection network (Head). The backbone network (Backbone) primarily performs
feature extraction by utilizing a convolutional network to extract object information from
the image. This information is then used to create a feature pyramid, which is later
employed for target detection. The backbone network consists of various modules, such
as the Focus module, Conv module, C3 module, and SPPF module. Notably, the SPPF
(Spatial Pyramid Pooling Faster) module is capable of converting feature maps of any size
into fixed-size feature vectors. This allows for the fusion of local and global features at the
feature map level and further expands the receptive field of the feature map. Consequently,
objects can be effectively detected even when input at different scales. The neck network
(Neck) is responsible for the multi-scale feature fusion of the feature map. It adopts the
structure of the Feature Pyramid Network (FPN) and the Path Aggregation Network (PAN),
which enhances the model’s ability to capture object features at various scales and improves
the accuracy and performance of target detection. The head network (Head), also known
as the detection module, utilizes techniques like anchor boxes to process the input feature
mapping and generate regression predictions. These predictions include information about
the type, location, and confidence of the crack detection object.
YOLOv7 [27] is an enhanced target detection framework based on YOLOv5. It in-
corporates a deeper network structure and robust methods, resulting in an improved
accuracy and speed compared to YOLOv5. YOLOv7 introduces several techniques, such
as the Efficient Layer Aggregation Network (ELAN) and the Bottleneck Attention Module (BAM),
to enhance its learning capability. ELAN expands, shuffles, and merges cardinality,
thereby improving the equilibrium state of the learning network. To prevent
overfitting, YOLOv7 employs a regularization method similar to DropBlock. This regular-
ization method enhances the stability and robustness of the model, enabling it to be trained
on larger datasets.
Figure 4. Network architecture of YOLOv5.
YOLOv8 [27] was released in January 2023 by Ultralytics, the company that developed
YOLOv5. YOLOv8 further optimizes the model structure and training strategy based on
YOLOv7 to enhance both detection speed and accuracy. Notably, YOLOv8 incorporates
a more efficient long-range attention network called Extended-ELAN (E-ELAN), which
enhances the model’s feature extraction capability. Moreover, YOLOv8 introduces new
loss functions, such as VFL Loss and Loss+DFL (Distribution Focal Loss), to improve the
model’s localization accuracy and category differentiation ability. Additionally, YOLOv8
employs new data enhancement methods, including Mosaic + MixUp, to enhance the
generalization and robustness of the model.
In the current field of deep learning models, Faster-RCNN, YOLOv5, YOLOv7, and
YOLOv8 are all target detection methods known for their high accuracy and advanced
algorithms. However, there are some variations in terms of model structure, accuracy,
speed, training strategy, and robustness. The selection of the appropriate algorithm should
be based on specific requirements and application scenarios to effectively address the needs
of UAV road crack target detection.
H ≥ (f × W)/Sw (1)
where H is the vertical flight height, f is the focal length of the camera, W is the width of
the road to be inspected, and Sw is the width of the camera sensor (sensor size Sw × Sh).
In our experiment, the DJI Mini 2 drone was chosen to perform the flight mission.
The camera sensor format was CMOS 1/2.3 inches, with a full-frame sensor size of
17.3 mm × 13.0 mm. The main focal length (f) was 24.0 mm. The experimental pavement
consisted of a bi-directional eight-lane road. To ensure high-definition imagery
quality, the experiment was conducted only on the left lane, from east to west. The width
of the road (W) was measured to be 16 m. The minimum flight altitude was calculated at
22.20 m. Taking into account the tolerance for flight stability, the final flight height was
chosen as 22.5 m.
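The flight-height calculation above can be reproduced with a few lines of Python. This is a minimal sketch of Equation (1) using the values reported in the text (f = 24.0 mm, W = 16 m, Sw = 17.3 mm). The function names and the rule of rounding up to the next 0.5 m are illustrative assumptions, since the text only states that a tolerance for flight stability was applied.

```python
import math


def min_flight_height(focal_length_mm: float, road_width_m: float,
                      sensor_width_mm: float) -> float:
    """Equation (1): H >= f * W / Sw.

    f and Sw are both in millimetres, so their units cancel and the
    result is in the same unit as the road width (metres).
    """
    return focal_length_mm * road_width_m / sensor_width_mm


def choose_flight_height(h_min_m: float, step_m: float = 0.5) -> float:
    """Round the minimum height up to the next multiple of step_m.

    Assumption: this rounding rule is hypothetical and only stands in
    for the unspecified flight-stability tolerance.
    """
    return math.ceil(h_min_m / step_m) * step_m


h_min = min_flight_height(24.0, 16.0, 17.3)
h_fly = choose_flight_height(h_min)
print(f"minimum height: {h_min:.2f} m, chosen height: {h_fly:.1f} m")
```

With the reported inputs, the sketch gives the same minimum altitude (22.20 m) and chosen height (22.5 m) as the experiment.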
4.1.2. Ground Sampling
Sampling
TheGround
The GroundSampling
SamplingDistance
Distance(GSD)
(GSD)isisa acrucial
crucialparameter
parameterininremote
remotesensing
sensingand
and
imageprocessing.
image processing. It quantifies
It quantifies thethe distance
distance between
between the individual
the individual pixels
pixels in anin an image
image and
andground
the the ground
truth,truth,
whichwhich directly
directly affects
affects thethe accuracy
accuracy of of geospatialmeasurements
geospatial measurementsforfor
cracks. (i) DJI can officially provide GSD values that are applicable to a wide range of focal
lengths [16]. The most commonly observed GSD value, typically associated with a 24 mm
focal length, is calculated as H/55. (ii) Alternatively, GSD can be derived directly from the
diagram in Figure 5b, using Equation (2):
GSD ≈ (μ ∗ H ) / f (2)
where GSD represents the ground sampling distance of a flight, and its unit is cm/pixel;
µ is the image pixel calibration size (µm), which can be officially provided by DJI. Taking the DJI
Mini 2 as an example, µ is given as 4.4 µm. With a flight height of 22.5 m and a focal length of
24 mm, the GSD is computed as 0.4125 cm/pixel.
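As a worked check of Equation (2), the short Python sketch below converts every quantity to metres before applying GSD ≈ µ × H / f, and compares the result with the H/55 rule of thumb quoted above. The function names are illustrative, and the inputs are the values given in the text (µ = 4.4 µm, H = 22.5 m, f = 24 mm).

```python
def gsd_cm_per_px(pixel_size_um: float, height_m: float,
                  focal_length_mm: float) -> float:
    """Equation (2): GSD ~= mu * H / f, returned in cm/pixel."""
    pixel_size_m = pixel_size_um * 1e-6      # micrometres -> metres
    focal_length_m = focal_length_mm * 1e-3  # millimetres -> metres
    gsd_m = pixel_size_m * height_m / focal_length_m
    return gsd_m * 100.0                     # metres -> centimetres


def gsd_rule_of_thumb_cm_per_px(height_m: float) -> float:
    """Approximation quoted in the text for a 24 mm focal length: H / 55."""
    return height_m / 55.0


gsd = gsd_cm_per_px(4.4, 22.5, 24.0)
gsd_approx = gsd_rule_of_thumb_cm_per_px(22.5)
print(f"Equation (2): {gsd:.4f} cm/px, H/55 rule: {gsd_approx:.4f} cm/px")
```

The two routes agree to within about 0.004 cm/pixel at this altitude, which suggests that either can be used for a first estimate.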
In our experiment, the road width (W) was chosen as 16.0 m and the GSD was
computed as 0.4125 cm/pixel using Equation (2); thereby, the ground-truth road length
(L) in an image was calculated as 38.78 m using Equation (4). Given that the sampling
interval t was 2 s, the forward overlap of the captured images was set to 75%. According
to Equation (3), the minimum speed v was 4.85 m/s, and finally, 5 m/s was determined
as the flight velocity for this experiment.
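Equation (3) is not reproduced in this part of the text, so the sketch below assumes the usual overlap relation v = L × (1 − r) / t, in which the UAV advances by the non-overlapping fraction of the image footprint during each sampling period. Under that assumption, the reported values (L = 38.78 m, r = 75%, t = 2 s) give the stated 4.85 m/s. The function name is hypothetical.

```python
def flight_speed_m_s(footprint_length_m: float, overlap: float,
                     sampling_period_s: float) -> float:
    """Assumed form of Equation (3): v = L * (1 - r) / t.

    footprint_length_m: ground length L covered by one image (m)
    overlap:            forward overlap r between consecutive images (0..1)
    sampling_period_s:  time t between consecutive images (s)
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_length_m * (1.0 - overlap) / sampling_period_s


v = flight_speed_m_s(38.78, 0.75, 2.0)
print(f"speed for 75% forward overlap: {v:.2f} m/s")
```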
N ≤ (GSD × Fl × (1 − r)/v) × fps (5)
where N is the frame extraction interval; Fl is the frame image size (px) along the flight
direction; fps represents the frames per second of the video; and the other variables are
described in the previous section.
For this experiment, UAV imagery in the DJI Mini 2 was set to 4K HD, which cor-
responds to the DJI official frame image size of 3840 px × 2160 px. Namely, Fl was
taken as 3840 px. The fps was officially 24 f·s−1 , and GSD and v were calculated to be
0.4125 cm/pixel and 5.0 m/s as above, respectively. The overlap (r) was taken as 75%.
Using Equation (5), the extraction interval number (N) was found to be [19.01], which was
rounded to 19. Finally, to ensure sufficient overlap, this study extracted an image every
19 frames from the video. The extracted frame images were then used to stitch together
overlapping parts of neighboring frames using the picture fusion technique.
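The interval calculation of Equation (5) can be sketched in Python as follows, using the reported values (GSD = 0.4125 cm/pixel, Fl = 3840 px, r = 75%, v = 5.0 m/s, fps = 24). The helper that lists the selected frame indices is an illustrative assumption. The actual video decoding and image stitching are outside the scope of this sketch.

```python
import math


def max_extraction_interval(gsd_cm_per_px: float, frame_len_px: int,
                            overlap: float, speed_m_s: float,
                            fps: float) -> float:
    """Equation (5): N <= GSD * Fl * (1 - r) / v * fps."""
    footprint_m = (gsd_cm_per_px / 100.0) * frame_len_px  # ground length of one frame
    advance_m = footprint_m * (1.0 - overlap)             # allowed advance between kept frames
    return advance_m / speed_m_s * fps                    # seconds of flight -> frames


def frame_indices(total_frames: int, step: int) -> list:
    """Indices of the frames kept when extracting one frame every `step` frames."""
    return list(range(0, total_frames, step))


n_max = max_extraction_interval(0.4125, 3840, 0.75, 5.0, 24)
step = math.floor(n_max)  # round down so the overlap stays at or above r
print(f"N <= {n_max:.2f}, extracting every {step} frames")
print(frame_indices(60, step))
```

Rounding down rather than to the nearest integer keeps the realized overlap from falling below the 75% target.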
Figure 6. Diagram of trimming process for the large frame image (modified from [10]).
Road crack labeling plays a crucial role in training and testing deep learning models.
The accuracy of labeling directly impacts the quality of model learning. In this experiment,
we employed various methods, including manual visual labeling and the Labelimg tool,
to decipher, mark, and categorize different types of cracks based on the original UAV
crack datasets. The goal was to create an improved training set for crack recognition.
Based on the prominence of cracks and their associated damage hazards, road cracks were
categorized into four types: longitudinal cracks, transverse cracks, diagonal cracks, and
mesh cracks. These categories are illustrated in Table 1.
Table 1. Classification and description of road cracks.
Longitudinal Crack (LC) | Transverse Crack (TC) | Oblique Crack (OC) | Alligator Crack (AC) | No-Cracks (Other)
Generally, the problem of imbalanced sample distribution in datasets can often lead to
overfitting of the model [30]. To address this issue, this experiment fully considered the
balance of the sample distribution when creating the labeling datasets. Each type of road
pavement crack had a more equal number distribution, as shown in Figure 7. A total of
1388 pavement crack images based on a UAV were collected and labeled, with 304 samples
identified as being of the longitudinal crack (LC) type, 303 samples identified as being of the
transverse crack (TC) type, 313 samples identified as being of the obliquely oriented crack
(OC) type, 368 samples identified as being of the alligator crack (AC) type, and 100 samples
being identified as of the no-crack type. To ensure the DL-based model's effectiveness, the
datasets were divided into training, validation, and test sets in the ratio of 80%, 10%, and
10%, respectively.
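One way to realize an 80/10/10 division while preserving the class balance is a per-class (stratified) split, sketched below in Python. This is an assumption about how such a split could be done, not a description of the procedure actually used in the study. The file names are hypothetical placeholders, and only the class counts are taken from the text.

```python
import random

# Class counts reported in the text (total: 1388 images).
CLASS_COUNTS = {"LC": 304, "TC": 303, "OC": 313, "AC": 368, "Other": 100}


def stratified_split(items_by_class, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split each class separately so every subset keeps the class balance."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    splits = {"train": [], "val": [], "test": []}
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * ratios[0])
        n_val = int(len(shuffled) * ratios[1])
        splits["train"] += [(name, label) for name in shuffled[:n_train]]
        splits["val"] += [(name, label) for name in shuffled[n_train:n_train + n_val]]
        # The remainder goes to the test set so that no image is dropped.
        splits["test"] += [(name, label) for name in shuffled[n_train + n_val:]]
    return splits


# Hypothetical file names standing in for the labeled UAV images.
items_by_class = {
    label: [f"{label}_{i:04d}.jpg" for i in range(count)]
    for label, count in CLASS_COUNTS.items()
}
splits = stratified_split(items_by_class)
for name, subset in splits.items():
    print(name, len(subset))
```

Because the counts are not exact multiples of ten, the subset sizes deviate slightly from the nominal ratios, and the remainder is assigned to the test set.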
Figure 7. Samples and distribution of the pavement crack datasets.
Notably, existing
Notably, existing crack
crackdatasets
datasetsoften
oftendo donot
notprovide
provideground-truth
ground-truthinformation,
information, partic-
par-
ularly regarding
ticularly regardingthethe
spatial resolution
spatial of UAV
resolution of UAVimagery.
imagery.ThisThis
lacklack
of information directly
of information di-
affectsaffects
rectly the accuracy of crackofidentification
the accuracy and measurement
crack identification in future studies.
and measurement in futureInstudies.
this study,
In
the UAV
this study,data
the collection
UAV dataprocess included
collection processthe recording
included the of the real-time
recording of theflight height,
real-time an
flight
important
height, an parameter
important for each image.
parameter Thereby,
for each theThereby,
image. Ground Sample Distance
the Ground (GSD)
Sample can be
Distance
calculated using Formula (2), and also documented in each treated image,
(GSD) can be calculated using Formula (2), and also documented in each treated image, which is crucial
for theissubsequent
which crucial for automated evaluation
the subsequent of pavement
automated evaluationdamage.
of pavement damage.
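How a recorded flight height turns into a GSD can be sketched as follows. Formula (2) is not reproduced in this part of the text, so the sketch uses the common photogrammetric relation for a nadir-pointing camera, which may differ in detail from the authors' formula. The function name and the camera values in the example are placeholders, not DJI Mini 2 specifications.

```python
def ground_sample_distance_cm(flight_height_m, sensor_width_mm,
                              focal_length_mm, image_width_px):
    """Return the GSD in cm/pixel for a camera pointing straight down.

    Assumed relation (may differ from Formula (2) in the paper):
        ground width = flight height * sensor width / focal length
        GSD          = ground width / image width in pixels
    """
    values = (flight_height_m, sensor_width_mm, focal_length_mm, image_width_px)
    if min(values) <= 0:
        raise ValueError("all inputs must be positive")
    ground_width_m = flight_height_m * sensor_width_mm / focal_length_mm
    return ground_width_m * 100.0 / image_width_px  # metres to centimetres

# Placeholder camera values, for illustration only.
gsd = ground_sample_distance_cm(flight_height_m=10.0, sensor_width_mm=6.0,
                                focal_length_mm=4.0, image_width_px=4000)
print(gsd)  # 0.375
```

Because the GSD scales linearly with flight height, storing the height with each image is enough to recover the scale later.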
Figure 8. Experimental road and scenario.
Figure 9. Loss plot of the YOLOv5s model (the optimal number of iterations is 200).
5.3. Evaluation Metrics of Models
5.3.1. Running Performance
To assess the computational cost of the deep learning models, five evaluation metrics were first used to measure the algorithms' running performance: the number of parameters, video memory usage, training duration, memory consumption, and frame rate (frames per second, FPS). FPS measures the number of images processed per second and serves as a significant indicator of prediction speed.
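The FPS metric can be measured with a simple timing loop. The sketch below is illustrative only: `measure_fps` and its `predict` argument are hypothetical names standing in for any model's inference call, and the warm-up count is an assumption rather than a setting reported in the paper.

```python
import time

def measure_fps(predict, images, warmup=2):
    """Return the number of images processed per second by `predict`."""
    images = list(images)
    if not images:
        raise ValueError("at least one image is required")
    # Warm-up calls are excluded from timing (first calls are often slower).
    for image in images[:warmup]:
        predict(image)
    start = time.perf_counter()
    for image in images:
        predict(image)
    elapsed = time.perf_counter() - start
    return len(images) / max(elapsed, 1e-12)  # guard against a zero interval

def fake_predict(image):
    """Stand-in for a detector: pretends inference takes about 10 ms."""
    time.sleep(0.01)
    return []

fps = measure_fps(fake_predict, images=range(20))
print(f"{fps:.1f} images per second")
```

On a GPU, asynchronous execution means the device would also need to be synchronized before reading the clock, which this CPU-only sketch does not cover.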
On identical datasets, the Faster-RCNN model demanded more computing resources and a higher environment configuration, whereas the YOLO series models required a lower hardware and software environment configuration while offering faster training speeds. Consequently, the YOLO series algorithms are highly suitable for the lightweight deployment of real-time detection tasks on UAV platforms.
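Table 5. Results of detection accuracy with various models under four crack types (%).

| Models | AP, TC | AP, LC | AP, AC | AP, OC | F1-Score, TC | F1-Score, LC | F1-Score, AC | F1-Score, OC |
|---|---|---|---|---|---|---|---|---|
| Faster-RCNN | 85.7 | 83.4 | 60.2 | 87.8 | 82.3 | 78.0 | 58.1 | 82.9 |
| YOLO v5s | 75.5 | 87.4 | 43.8 | 89.1 | 72.3 | 86.5 | 43.5 | 88.0 |
| YOLO v7-tiny | 70.4 | 81.2 | 40.7 | 80.7 | 70.0 | 79.0 | 44.8 | 77.1 |
| YOLO v8s | 75.4 | 89.5 | 45.4 | 91.0 | 74.4 | 85.0 | 48.5 | 90.6 |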
[…] exhibited unsatisfactory recognition with low confidence levels, often resulting in the omission or separate identification of complete cracks, whereas the Faster-RCNN model demonstrated a superior capability in recognizing oblique cracks (OC) more comprehensively. Additionally, the Faster-RCNN model also exhibited an excellent performance in detecting subtle cracks, as shown in Table 7. For instance, it successfully identified a subtle transverse crack within a longitudinal crack. In other crack types, all four modeling algorithms demonstrated effectiveness in crack detection. A comparative analysis considering both combined effects and confidence levels reveals that Faster-RCNN achieved the best overall performance; among the YOLO series algorithms, YOLOv5s and YOLOv8s showed comparable results, while YOLOv7-tiny performed relatively poorly, with lower confidence levels observed across all detected results.

Table 6. Comparison of model effectiveness evaluation with various UAV crack datasets. FPS is given in f·s⁻¹; F1 and mAP are given in %.

| Datasets | Faster-RCNN FPS | Faster-RCNN F1 | Faster-RCNN mAP | YOLO v5s FPS | YOLO v5s F1 | YOLO v5s mAP | YOLO v7-Tiny FPS | YOLO v7-Tiny F1 | YOLO v7-Tiny mAP | YOLO v8s FPS | YOLO v8s F1 | YOLO v8s mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| UAPD [2] | 9.14 | 47.9 | 48.8 | 59.7 | 52.7 | 57.7 | 74.51 | 56.7 | 52.8 | 65.4 | 57.4 | 58.6 |
| RDD2022 [31] | 11.36 | 69.5 | 68.8 | 63.21 | 65.2 | 60.9 | 65.47 | 63.1 | 65.6 | 53.71 | 66.5 | 67.7 |
| UMSC [19] | 11.72 | 73.4 | 68.8 | 97.87 | 68.7 | 74.3 | 76.81 | 63.8 | 70.1 | 89.78 | 72.8 | 70.4 |
| UAVRoadCrack [21] | 10.57 | 68.9 | 68.5 | 108.6 | 77.8 | 75.7 | 75.39 | 62.5 | 65.3 | 69.36 | 71.0 | 68.8 |
| CrackForest [32] | / | 57.4 | 59.1 | / | 57.8 | 58.8 | 67.45 | 61.2 | 63.5 | 61.21 | 60.9 | 65.2 |
| Our Datasets | 12.80 | 75.3 | 79.3 | 127.4 | 72.6 | 74.0 | 82.56 | 66.7 | 65.5 | 125.7 | 75.0 | 77.1 |

Table 7. Comparison of detection results with various models.
[Image grid; columns: Input Images, Faster-RCNN, YOLOv5s, YOLOv7-Tiny, YOLOv8s]
Appl. Sci. 2024, 14, 1157 17 of 23
Figure 10. Illustration of “Divide and Merge strategy” of UAV imagery for crack detection.
6.1. Measurement Methods of Pavement Cracks
The measurement methods for crack analysis play a crucial role in statistically analyzing the quantity of cracks. These methods consider various factors, such as crack location, crack type, crack length, crack width, crack depth, and crack area. In order to improve the practicality of these methods in road damage maintenance, the quantity of cracks can be roughly estimated, temporarily excluding small cracks.
(i) Pavement Crack Location: The pixel position of the detected crack in the original UAV imagery can be determined based on the corresponding image number; meanwhile, the actual ground position can be inferred through GSD calculation.
(ii) Pavement Crack Length: This can be determined based on the pixel size of the confidence frame model, as illustrated in Figure 11. Horizontal cracks are measured by their horizontal border pixel lengths; vertical cracks by their vertical border pixel lengths; diagonal cracks by estimated border diagonal distance pixels; and mesh cracks primarily by measuring border pixel areas.
(iii) Pavement Crack Width: The maximum width of a crack can be determined by identifying the region with the highest concentration of extracted crack pixels.
(iv) Pavement Crack Area: This mainly aims at alligator cracks (AC), with a measurement of the crack area. It can be calculated by the pixels of AC based on the confidence frame model.
Figure 11. Diagram of crack measurements.
Finally, to determine the location, length (L), and area (A) of road cracks with ground truth, the quantitative results of cracks can be determined by multiplying the ground sampling distance (GSD, unit: cm/pixel) by the pixels at which they are located. The actual length or width of the crack (m) can be calculated as the pixel length × GSD/100, while the actual area of the block affected by the crack (m²) can be derived from the pixel area × GSD²/100².
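The pixel-to-ground conversion described above can be sketched for a single detection box. This is an illustrative reading of the measurement rules, not the authors' implementation: the function name `measure_crack`, the orientation labels, and the example boxes are assumptions, and the "confidence frame" is taken to be an axis-aligned bounding box in pixel coordinates.

```python
import math

def measure_crack(box, orientation, gsd_cm_per_px):
    """Convert a detection box (x_min, y_min, x_max, y_max), in pixels,
    into a ground length in metres or, for mesh cracks, an area in m^2."""
    x_min, y_min, x_max, y_max = box
    width_px = x_max - x_min
    height_px = y_max - y_min
    if width_px <= 0 or height_px <= 0:
        raise ValueError("box must have positive width and height")
    if gsd_cm_per_px <= 0:
        raise ValueError("GSD must be positive")
    m_per_px = gsd_cm_per_px / 100.0  # cm/pixel to m/pixel
    if orientation == "mesh":
        # Alligator (mesh) cracks: box area scaled by GSD^2 / 100^2.
        return {"area_m2": width_px * height_px * m_per_px ** 2}
    if orientation == "horizontal":
        length_px = width_px
    elif orientation == "vertical":
        length_px = height_px
    elif orientation == "diagonal":
        length_px = math.hypot(width_px, height_px)  # box diagonal
    else:
        raise ValueError(f"unknown orientation: {orientation}")
    return {"length_m": length_px * m_per_px}

# Hypothetical boxes at an assumed GSD of 0.5 cm/pixel.
horizontal = measure_crack((100, 200, 500, 260), "horizontal", 0.5)  # 400 px wide
diagonal = measure_crack((0, 0, 300, 400), "diagonal", 0.5)          # 500 px diagonal
mesh = measure_crack((0, 0, 300, 400), "mesh", 0.5)                  # 120000 px^2
print(horizontal, diagonal, mesh)
```

The box diagonal overestimates the length of a curved or short diagonal crack, which is consistent with the text's description of these values as rough estimates.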
6.2. Evaluation Methods of Pavement Distress
The evaluation of pavement damage can be determined using the internationally recognized pavement condition index (PCI), which is also adopted in China. The PCI provides a crucial indicator for assessing the level of pavement integrity. Additionally, the pavement damage rate (DR) represents the most direct manifestation and reflection of the physical properties related to the pavement condition. In this study, we refer to specifications such as ‘Technical Code of Maintenance for Urban Road (CJJ36-2016)’ [33] and ‘Highway Performance Assessment Standards (DB11/T1614-2019)’ [34] from the Chinese government, incorporating their respective calculation formulas as follows:
$$\mathrm{DR} = 100 \times \frac{\sum_{i=1}^{N} w_i A_i}{A} \tag{6}$$
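Equation (6) can be evaluated directly once the damage areas are known. The symbol definitions are not reproduced in this part of the text, so the sketch assumes the usual reading: $A_i$ is the measured area of the $i$-th damage item, $w_i$ is its weight, and $A$ is the surveyed pavement area. The function name and the weights and areas in the example are hypothetical and are not taken from the cited standards.

```python
def damage_rate(damages, surveyed_area_m2):
    """Evaluate DR = 100 * sum(w_i * A_i) / A.

    damages: iterable of (weight, area_m2) pairs, one per damage item.
    surveyed_area_m2: total pavement area A covered by the survey.
    """
    if surveyed_area_m2 <= 0:
        raise ValueError("surveyed area must be positive")
    weighted_area = sum(weight * area for weight, area in damages)
    return 100.0 * weighted_area / surveyed_area_m2

# Hypothetical example: two damage items on a 100 m^2 stretch of road.
example_damages = [(0.6, 3.0), (1.0, 2.0)]  # (weight, area in m^2), made-up values
dr = damage_rate(example_damages, surveyed_area_m2=100.0)
print(round(dr, 2))  # 3.8
```

The areas fed into this calculation would come from the GSD-scaled crack measurements of Section 6.1.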
Figure 12. Visualization results of crack detection and pavement distress evaluation.
7. Discussion
This study proposes a comprehensive and systematic framework and method for automatic crack detection and pavement distress evaluation in a UAV inspection system. The framework begins by establishing the flight parameter settings and experimental techniques needed to obtain high-quality imagery using the DJI Mini 2 drone in real-world scenarios. Additionally, a benchmark dataset was created and has been made available to the community. The dataset includes important information such as the GSD, which is essential for evaluating pavement distress. In this experiment, our self-made crack dataset demonstrated its superiority compared to existing datasets used in similar algorithms, achieving the highest accuracy in crack recognition and algorithmic efficiency. The experimental result (refer to Table 6) revealed the significance of data acquisition quality in the accuracy of crack target recognition, with high-quality image data from the UAV imagery effectively improving the recognition accuracy.
In this experiment, the detection capability for road cracks in a UAV inspection system could be enhanced through a range of strategies. Firstly, adhering to a drone flight control strategy ensured a consistent height and stable speed during data acquisition on urban roads. This guaranteed the collection of clear and high-quality drone images with attached real spatial scale information for distress assessments. Secondly, the ‘divide and conquer’ sampling strategy for model training and target detection involves three key steps: extracting frames from video and cropping the large images, model learning and crack detection on the small images, and fusing and splicing the results from the small images. This approach effectively improves the accuracy of identifying cracks in large-scale images while enhancing the operational efficiency of these models. Thirdly, the deployment of drone detection algorithms using both ‘online–offline’ and ‘online–online’ strategies provides flexibility based on different scenarios. The ‘one-stage’ algorithm operates quickly, but has a lower detection accuracy, whereas the ‘two-stage’ algorithm exhibits a slower running efficiency but a higher detection accuracy. These deep learning models can be deployed accordingly, depending on the specific application scenarios. For instance, in sudden situations requiring fast real-time detection, lightweight deployment using a ‘one-stage’ algorithm such as the YOLO series models can be employed.
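The crop, detect, and splice steps of the sampling strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `tile_rects`, `detect_large_image`, and the `detect_tile` callback are hypothetical names, the 640-pixel tile size is an assumed default, and the sketch only shifts boxes back into full-image coordinates without removing duplicates from overlapping tiles.

```python
def tile_rects(width, height, tile=640, stride=640):
    """Yield (x0, y0, x1, y1) crops that together cover a width x height image."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    # Add a last column/row flush with the border so no pixels are skipped.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    for y0 in ys:
        for x0 in xs:
            yield (x0, y0, min(x0 + tile, width), min(y0 + tile, height))

def detect_large_image(width, height, detect_tile, tile=640, stride=640):
    """Run `detect_tile` on every crop and shift its boxes to full-image pixels.

    detect_tile(rect) is a hypothetical callback returning boxes as
    (x0, y0, x1, y1, label, score) in the crop's own coordinates.
    """
    merged = []
    for x0, y0, x1, y1 in tile_rects(width, height, tile, stride):
        for bx0, by0, bx1, by1, label, score in detect_tile((x0, y0, x1, y1)):
            merged.append((bx0 + x0, by0 + y0, bx1 + x0, by1 + y0, label, score))
    return merged

def fake_detector(rect):
    """Stand-in detector: reports one fixed box in every crop."""
    return [(10, 10, 20, 20, "TC", 0.9)]

boxes = detect_large_image(1280, 640, fake_detector)
print(boxes)
# [(10, 10, 20, 20, 'TC', 0.9), (650, 10, 660, 20, 'TC', 0.9)]
```

A crack that crosses a tile border is reported once per tile here, so a practical merge step would still need to join or suppress such fragments.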
To propose a suitable deployment scheme for the UAV inspection system, this study utilized prominent deep learning algorithms, namely Faster-RCNN, YOLOv5s, YOLOv7-tiny, and YOLOv8s, for pavement crack object detection and a comparative analysis. The results revealed that Faster-RCNN demonstrated the best overall performance, with a precision (P) of 75.6%, a recall (R) of 76.4%, an F1-score of 75.3%, and a mean Average Precision (mAP) of 79.3%. Moreover, the mAP of Faster-RCNN surpassed that of YOLOv5s, YOLOv7-tiny, and YOLOv8s by 4.7%, 10%, and 4%, respectively. This indicates that Faster-RCNN outperformed the YOLO models in detection accuracy but required a higher environment configuration, making it suitable for online data collection using a UAV and offline inspection at workstations. On the other hand, the YOLO series models, while slightly less accurate, were the fastest algorithms and are suitable for the lightweight deployment of UAVs with online collection and real-time inspection. Many studies have also proposed refined YOLO-based algorithms for crack detection with drones, mainly because of their lightweight deployment in UAV systems. For instance, the BL-YOLOv8 model [20] reduces both the number of parameters and computational complexity compared to the original YOLOv8 model and other YOLO series models. This offers the potential to directly deploy the YOLO series models on cost-effective embedded devices or mobile devices.
Finally, road crack measurement methods are presented to assess road damage, which
will enhance the application of the UAV inspection system and provide factual evidence
for the maintenance decisions made by road authorities. Notably, cracking is a significant
indicator for evaluating road distress. In this study, the evaluation results were primarily
obtained through a comprehensive assessment of the crack area, degree of damage, and
their proportions. However, relying solely on cracks to determine road distress may be
deemed limited, and this should only be considered as a reference for the relevant road
authorities. Therefore, it is essential to conduct a comprehensive evaluation that takes into
account multiple factors, such as rutting and potholes.
8. Conclusions
The traditional manual inspection of road cracks is inefficient, time-consuming, and labor-intensive. Additionally, using multifunctional road inspection vehicles can be expensive. However, the use of UAVs equipped with high-resolution vision sensors offers a solution. These UAVs can remotely capture and display images of the pavement from high altitudes, allowing for the identification of local damages such as cracks. The UAV inspection system, which is based on the commercial DJI Mini 2 drone, offers several advantages. It is cost-effective, non-contact, highly precise, and enables remote visualization. As a result, it is particularly well-suited for remote pavement detection. In addition, automatic crack detection technology based on deep learning models brings significant additional value to the field of road maintenance and safety. It can be integrated into the commercial UAV system, thereby reducing the workload of maintenance personnel.
The contributions of this study are summarized as follows: (1) A pavement crack detection and evaluation framework for a UAV inspection system based on deep learning was proposed, which can provide technical guidelines for road authorities. (2) To enhance automatic crack detection capability and design a suitable scheme for implementing deep-learning-based models in a UAV inspection system, we conducted a validation and comparative study on prevalent deep learning algorithms for detecting pavement cracks in urban road scenarios. The study demonstrates the robustness of these algorithms in terms of their performance and accuracy, as well as their effectiveness in handling our customized crack image datasets and other popular crack datasets. Furthermore, this research provides recommendations for leveraging UAVs in deploying these algorithms. (3) Quantitative methods for road cracks were proposed, and pavement distress evaluations were carried out in our experiment. Because every image carries its GSD, the final evaluation results are expressed in real ground units. (4) A pavement crack image dataset integrated with GSD was established and has been made publicly available for the research community, serving as a valuable supplement to existing crack databases.
In summary, the UAV inspection system, under the guidance of our proposed framework, proved feasible and yielded satisfactory results. However, drone inspection has the inherent limitation of a limited battery life, making it difficult to perform long-distance continuous road inspection tasks. Drones are better suited for short-distance inspections in complex urban scenarios [16]. With advancements in drone and computer vision technology, drones equipped with lightweight sensors and lightweight crack detection algorithms are expected to gain popularity for road distress inspection. In the future, this study aims to incorporate improved YOLO algorithms into the UAV inspection system to enhance road crack recognition accuracy. Furthermore, in order to establish a comprehensive UAV inspection system for road distress, we plan to continue researching multi-category defect detection systems that cover road issues such as rutting and potholes in addition to cracks. Additionally, efforts will be made to enhance UAV flight autonomy for stability and high-speed aerial photography, further improving the quality of aerial images and catering to the requirements of various complex road scenarios.
Author Contributions: Conceptualization, X.C., C.L. and L.C.; methodology, X.C. and L.C.; software,
C.L. and L.C.; validation, C.L., X.Z. and L.C.; formal analysis, X.C., X.Z. and Y.Z.; investigation, L.C.,
C.L., C.W. and Y.Z.; resources, X.C. and C.L.; data curation, C.L. and C.W.; writing—original draft
preparation, X.C., C.L. and Y.Z.; writing—review and editing, X.C. All authors have read and agreed
to the published version of the manuscript.
Funding: This research was funded by the China Postdoctoral Science Foundation (2017M622577), the Hunan Provincial Natural Science Foundation (2018JJ2118), and the Chinese National College Students Innovation and Entrepreneurship Training Program (S202310534031, S202310534169).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The UAV crack dataset presented in this study is openly available in FigShare at 10.6084/m9.figshare.25103138.
Acknowledgments: The authors would like to express many thanks to all the anonymous reviewers.
References
1. Samadzadegan, F.; Dadrass Javan, F.; Hasanlou, M.; Gholamshahi, M.; Ashtari Mahini, F. Automatic Road Crack Recognition
Based on Deep Learning Networks from UAV Imagery. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, X-4/W1-2022,
685–690. [CrossRef]
2. Zhu, J.; Zhong, J.; Ma, T.; Huang, X.; Zhang, W.; Zhou, Y. Pavement distress detection using convolutional neural networks with
images captured via UAV. Autom. Constr. 2022, 133, 103991. [CrossRef]
3. Cao, J.; Yang, G.T.; Yang, X.Y. Pavement Crack Detection with Deep Learning Based on Attention Mechanism. J. Comput. Aided
Des. Comput. Graph. 2020, 32, 1324–1333.
4. Qi, S.; Li, G.; Chen, D.; Chai, M.; Zhou, Y.; Du, Q.; Cao, Y.; Tang, L.; Jia, H. Damage Properties of the Block-Stone Embankment in
the Qinghai–Tibet Highway Using Ground-Penetrating Radar Imagery. Remote Sens. 2022, 14, 2950. [CrossRef]
5. Guo, S.; Xu, Z.; Li, X.; Zhu, P. Detection and Characterization of Cracks in Highway Pavement with the Amplitude Variation of
GPR Diffracted Waves: Insights from Forward Modeling and Field Data. Remote Sens. 2022, 14, 976. [CrossRef]
6. Salman, M.; Mathavan, S.; Kamal, K.; Rahman, M. Pavement crack detection using the Gabor filter. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), The Hague, The Netherlands, 6–9 October 2013; pp. 2039–2044. [CrossRef]
7. Ayenu-Prah, A.; Attoh-Okine, N. Evaluating pavement cracks with bidimensional empirical mode decomposition. EURASIP J.
Adv. Signal Process. 2008, 2008, 861701. [CrossRef]
8. Majidifard, H.; Adu-Gyamfi, Y.; Buttlar, W.G. Deep machine learning approach to develop a new asphalt pavement condition
index. Constr. Build. Mater. 2020, 247, 118513. [CrossRef]
9. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017,
60, 84–90. [CrossRef]
10. Tran, V.P.; Tran, T.S.; Lee, H.J.; Kim, K.D.; Baek, J.; Nguyen, T.T. One stage detector (RetinaNet)-based crack detection for asphalt
pavements considering pavement distresses and surface objects. J. Civ. Struct. Health Monit. 2021, 11, 205–222. [CrossRef]
11. Xiao, L.Y.; Li, W.; Yuan, B.; Cui, Y.Q.; Gao, R.; Wang, W.Q. Pavement Crack Automatic Identification Method Based on Improved
Mask R-CNN Model. Geomat. Inf. Sci. Wuhan Univ. 2022, 47, 623–631. [CrossRef]
12. Xu, K.; Ma, R.G. Crack detection of asphalt pavement based on improved faster RCNN. Comput. Syst. Appl. 2022, 31, 341–348.
[CrossRef]
13. Xu, X.; Zhao, M.; Shi, P.; Ren, R.; He, X.; Wei, X.; Yang, H. Crack detection and comparison study based on faster R-CNN and
mask R-CNN. Sensors 2022, 22, 1215. [CrossRef]
14. Yan, K.; Zhang, Z. Automated asphalt highway pavement crack detection based on deformable single shot multi-box detector
under a complex environment. IEEE Access 2021, 9, 150925–150938. [CrossRef]
15. Yokoyama, S.; Matsumoto, T. Development of an automatic detector of cracks in concrete using machine learning. Procedia Eng.
2017, 171, 1250–1255. [CrossRef]
16. Jiang, Y.T.; Yan, H.T.; Zhang, Y.R.; Wu, K.Q.; Liu, R.Y.; Lin, C.Y. RDD-YOLOv5: Road Defect Detection Algorithm with Self-
Attention Based on Unmanned Aerial Vehicle Inspection. Sensors 2023, 23, 8241. [CrossRef] [PubMed]
17. Zhang, Y.; Zuo, Z.; Xu, X.; Wu, J.; Zhu, J.; Zhang, H.; Wang, J.; Tian, Y. Road damage detection using UAV images based on
multi-level attention mechanism. Autom. Constr. 2022, 144, 104613. [CrossRef]
18. Zhou, Q.; Ding, S.; Qing, G.; Hu, J. UAV vision detection method for crane surface cracks based on Faster R-CNN and image
segmentation. J. Civ. Struct. Health Monit. 2022, 12, 845–855. [CrossRef]
19. Xiang, X.; Hu, H.; Ding, Y.; Zheng, Y.; Wu, S. GC-YOLOv5s: A Lightweight Detector for UAV Road Crack Detection. Appl. Sci.
2023, 13, 11030. [CrossRef]
20. Wang, X.; Gao, H.; Jia, Z.; Li, Z. BL-YOLOv8: An Improved Road Defect Detection Model Based on YOLOv8. Sensors 2023,
23, 8361. [CrossRef] [PubMed]
21. Omoebamije, O.; Omoniyi, T.M.; Musa, A.; Duna, S. An improved deep learning convolutional neural network for crack detection
based on UAV images. Innov. Infrastruct. Solut. 2023, 8, 236. [CrossRef]
22. Zhao, Y.; Zhou, L.; Wang, X.; Wang, F.; Shi, G. Highway Crack Detection and Classification Using UAV Remote Sensing Images
Based on CrackNet and CrackClassification. Appl. Sci. 2023, 13, 7269. [CrossRef]
23. Liu, K. Learning-based defect recognitions for autonomous uav inspections. arXiv 2023, arXiv:2302.06093v1.
24. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276. [CrossRef]
25. Bubbliiiing. Faster-RCNN-PyTorch[CP]. 2023. Available online: https://github.com/bubbliiiing/faster-rcnn-pytorch (accessed
on 26 January 2024).
26. Ultralytics. YOLOv5[CP]. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 26 January 2024).
27. Wong, K.Y. YOLOv7[CP]. 2023. Available online: https://github.com/WongKinYiu/yolov7 (accessed on 26 January 2024).
28. Ultralytics. YOLOv8[CP]. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 26 January 2024).
29. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural
Inf. Process. Syst. 2015, 28, 1137–1149. [CrossRef]
30. Buda, M.; Maki, A.; Mazurowski, M.A. A systematic study of the class imbalance problem in convolutional neural networks.
Neural Netw. 2018, 106, 249–259. [CrossRef]
31. Sami, A.A.; Sakib, S.; Deb, K.; Sarker, I.H. Improved YOLOv5-Based Real-Time Road Pavement Damage Detection in Road Infrastructure Management. Algorithms 2023, 16, 452. [CrossRef]
32. Faramarzi, M. Road damage detection and classification using deep neural networks (YOLOv4) with smartphone images.
SSRN 2020. [CrossRef]
33. CJJ36-2016; Technical Code of Maintenance for Urban Road. Ministry of Housing and Urban-Rural Development of the People’s Republic of China: Beijing, China, 2017. Available online: https://www.mohurd.gov.cn/gongkai/zhengce/zhengcefilelib/201702/20170228_231174.html (accessed on 10 May 2023).
34. JTG 5210-2018; Highway Performance Assessment Standards. Ministry of Transport of the People’s Republic of China:
Beijing, China, 2018. Available online: https://xxgk.mot.gov.cn/2020/jigou/glj/202006/t20200623_3313114.html (accessed on
10 May 2023).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.