https://doi.org/10.1007/s00521-024-09638-6
ORIGINAL ARTICLE
Received: 20 February 2023 / Accepted: 21 February 2024 / Published online: 27 March 2024
© The Author(s) 2024
Abstract
Rapid detection of damage caused by natural disasters is vital for emergency response. In recent years, remote sensing techniques have been widely used to automatically categorize and localize such events in satellite images. Convolutional neural networks (CNNs) trained on natural disaster images have proved highly successful at this task owing to their ability to extract discriminative features. Studies that detect target regions by extracting visual features from natural images with these networks have achieved their goals. In this study, ensemble learning methods are proposed to improve the detection of landslide areas in landslide satellite images. CNN models were first trained to classify a landslide image dataset and were then reused to localize landslide regions. To reduce the sensitivity to prediction variance and to the training data, and to improve overall performance, different ensemble strategies were applied and combined over the model predictions. Class-selective relevance mapping (CRM) was used to visualize the learned behavior of the individual CNN models and their ensembles. Comparisons based on the mean average precision (mAP) metric and the intersection over union (IoU) criterion show that the model ensembles achieve higher localization performance than any individual model.
1 Introduction
object-based landslide detection with conventional machine learning methods (Logistic Regression, Support Vector Machines, Random Forest, Discrete AdaBoost, LogitBoost, Gentle AdaBoost, Convolutional Neural Network). Deep learning (DL) algorithms based on convolutional neural networks (CNNs) have successfully been applied to landslide detection [9–11] and have thus increased the interest in automatic landslide detection [12]. Different DL and visualization techniques have been developed to localize landslide regions [13–16]. The CNN is one of the most widely used DL methods for landslide detection [12]. Shi et al. [17] proposed a method combining a CNN with change detection for faster detection of landslides in remote sensing images, and reported improvements in detection speed with this approach. Transfer learning methods based on DL are frequently used to detect natural disasters in satellite images as a remote sensing technique. In this approach, a CNN is first trained on the target objects in the images, and the learned knowledge is then transferred and reused for subsequent tasks [18–20]. Catani [21] used four pre-trained CNN algorithms (GoogLeNet, GoogLeNet-Places365, ResNet-101, and Inception-V3) to detect landslides from photographs. To detect landslide locations within large-scale satellite images, an object detection algorithm called Faster R-CNN was trained by Li et al. [22]; the researchers suggested and visualized bounding boxes for each landslide location. In studies using Mask R-CNN [23, 24], experiments were conducted using remote sensing methods and landslide-inducing information, and satisfactory results were obtained. In another study [25] dealing with the reliability of landslide detection by trained CNN models (ResNet-50, VGG-19, Inception, and Xception), researchers compared visualization techniques such as Grad-CAM, Grad-CAM++, and Score-CAM; VGG-19 achieved over 90% performance, and the Grad-CAM and Score-CAM techniques proved effective in localizing landslide areas. In addition to studies examining CNN performance [26] to develop a model that can automatically detect landslides from image streams on social media, pixel-based landslide detection studies are also increasing. In these studies, DL models called DemDet [27] and SFCNet [28] have been proposed, and pixel-, sub-pixel-, and object-based image analysis techniques have been compared for landslide detection [29]. Images of natural disasters, especially landslides, exhibit different visual characteristics such as color, texture, and shape, and combinations of these. Therefore, the application of DL to landslide detection primarily focuses on image analysis [30]. It is therefore necessary to apply innovative, performance-enhancing methods beyond the available ones when training on these images.

This study has set out to classify landslide and non-landslide images and to localize landslide areas with DL methods. A variety of pre-trained models, including a custom CNN, VGG-16, VGG-19 [31], Inception-V3 [32], Xception [33], MobileNet [34], DenseNet-121, and NASNet-mobile [35], have been trained on a large-scale landslide dataset [16] for the analyses. The CNN predictions have been combined with various ensemble strategies such as majority vote, averaging, weighted averaging, and stacking to reduce the prediction variance due to training data and learning algorithms and to improve overall performance. Moreover, visualization techniques have been applied to interpret the important characteristics contributing to the classification of landslide images. The learned behaviors of the individual models and their ensembles have been visualized with the CRM method. This is the first study to suggest the combination of knowledge transfer based on landslide images with ensemble learning, and to evaluate the localization of the regions of interest (ROI) in landslide areas.

2 Material and methods

2.1 Data collection and preprocessing

In this study, an open-source dataset named the Bijie landslide dataset has been used to automatically detect landslides with DL methods [16]. The research area covers 26,853 km² in the city of Bijie in Guizhou Province, China. The landslides in Bijie consist of rockfalls and a few debris slides. The dataset, imaged by the TripleSat satellite between May and August 2018, contains 770 landslide images (red dots, Fig. 1) and 2003 non-landslide images. The landslide and non-landslide image samples are provided in PNG format. The spatial resolution of the aerial imagery is 0.8 m. An image sample from the Bijie landslide dataset, which is available at http://study.rsgis.whu.edu.cn/pages/download/, is presented in Fig. 2.

Fig. 1 Display of landslide areas [16]
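As a concrete illustration, the published PNG tiles can be read directly for classification experiments. The snippet below is a minimal sketch of building training and validation splits with Keras; the directory layout (class-named subfolders such as landslide/ and non-landslide/), the 80/20 split, the seed, and the 224 × 224 resizing are assumptions for illustration and are not prescribed by the dataset description above.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)   # input size expected by most of the pre-trained backbones used later
BATCH_SIZE = 32

# Assumed layout: bijie_dataset/landslide/*.png and bijie_dataset/non-landslide/*.png
train_ds = tf.keras.utils.image_dataset_from_directory(
    "bijie_dataset",
    validation_split=0.2,        # hold-out fraction is an assumption
    subset="training",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    label_mode="categorical",    # one-hot labels for categorical cross-entropy
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "bijie_dataset",
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    label_mode="categorical",
)
```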
have been provided to the insufficiently represented classes to prevent model bias and overfitting [41]. Callbacks have been used to monitor the state of the models during training. Checkpoints for each epoch have been stored as files with the .h5 extension, and early stopping has been applied to prevent overfitting. The best model weights have been kept in memory to perform hold-out testing. Performance criteria such as accuracy, area under the curve (AUC), sensitivity, specificity, F measure, and the Matthews correlation coefficient (MCC) have been used for the models in transfer learning and ensemble learning. The CUDA/cuDNN libraries and the Keras API with a TensorFlow backend have been used for GPU acceleration. Models have been trained and evaluated on Windows 11 with 32 GB RAM and an NVIDIA Quadro RTX 4000 GPU.

[15, 17]. With this method, Convolutional Neural Network structures can be trained effectively with less data. In the study, first the CNN model and then the transfer learning models were used for classification on the same dataset. While an 80% (± 2) success rate was achieved with the CNN, a 95% (± 1) success rate was obtained with the transfer learning models. The results are important in that they show the usefulness of the transfer learning approach. Figure 5 depicts a transfer learning architecture.

Because the lowest layers are configured as non-trainable, the weights of the pre-trained models are not lost in the proposed study. The fully connected layer is replaced by a global average pooling layer at the final convolutional layer, which takes the average of each feature map and outputs a feature map for each associated class. The flattened feature map is passed through a dense layer, a dropout layer, and another dense layer before being passed through the Softmax layer [42]. For a multi-class classification problem, categorical cross-entropy is used. All of the pre-trained models are trained for 30 epochs with a batch size of 32 using the Adam optimizer with a learning rate of 10⁻⁴.
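A minimal Keras sketch of this transfer-learning setup is given below, with VGG-19 shown as one of the backbones. The frozen backbone, global average pooling head, dense/dropout/dense layers with Softmax output, categorical cross-entropy, Adam optimizer with a learning rate of 10⁻⁴, 30 epochs, per-epoch .h5 checkpoints, and early stopping follow the description above; the dense-layer width, dropout rate, and early-stopping patience are assumptions, and train_ds/val_ds are the datasets built in the earlier sketch (batch size 32 is set there).

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 2  # landslide / non-landslide

# Pre-trained backbone with its layers frozen so the ImageNet weights are preserved
base = keras.applications.VGG19(include_top=False, weights="imagenet",
                                input_shape=(224, 224, 3))
base.trainable = False

# Replacement head: global average pooling, dense, dropout, dense with Softmax
inputs = keras.Input(shape=(224, 224, 3))
x = keras.applications.vgg19.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation="relu")(x)    # width is an assumption
x = layers.Dropout(0.5)(x)                     # rate is an assumption
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy", keras.metrics.AUC(name="auc")])

callbacks = [
    # one .h5 checkpoint file per epoch, as described above
    keras.callbacks.ModelCheckpoint("checkpoint_epoch_{epoch:02d}.h5"),
    # stop early and keep the best weights in memory for hold-out testing
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
]

history = model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=callbacks)
```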
2.5 CNN architectures for pre-trained models

This study involved the classification of landslide images using different CNN architectures. The networks utilized for the analysis are VGG-16, VGG-19 [31], Inception-V3 [32], Xception [43], DenseNet-121 [44], MobileNet [45], and NASNet-Mobile [46], each offering a different approach to the object classification task at hand, as detailed in the subsequent subsections.

Each model's middle layers capture specific features and representations aligned with their tasks and designs. These layers shape the information processing capabilities of each model and influence how they perform in a particular task or application.

2.5.1 VGG-16 model

VGG-16 [31] is a state-of-the-art DL model pre-trained with over 1 million images from the ImageNet database. It classifies objects into 1000 categories using 16 concatenated convolution and maximum pooling layers. The model is optimized for a 224 × 224 image input size and predicts accurately with Softmax activation. Its 138 million parameters make it a powerful tool for capturing complex features, although it requires extensive computational resources. VGG-16 is widely used in computer vision applications and AI. The image input size of the network is 224 × 224. The VGG-16 architecture is illustrated in Fig. 6.

2.5.2 VGG-19 model

VGG-19 [31] is a CNN model with 19 layers that uses a small convolution kernel. This network can also be loaded with a pre-trained version trained on over one million images from the ImageNet database, enabling it to classify images into 1000 object categories. The network has learned to represent features from a diverse set of images, including animals, office supplies, and other objects. The image input size for this network is 224 × 224. The VGG-19 architecture is illustrated in Fig. 7.

VGG-16 and VGG-19 are traditional Convolutional Neural Network (CNN) models with a deep architecture. The features extracted by their middle layers tend to recognize low-level visual features such as edges, corners, and simple patterns, as well as more complex object parts. The commonly used activation function is ReLU, which enhances positive features and aids learning. These models perform well in object recognition and classification tasks.

2.5.3 Inception-V3 model

The Inception architecture, developed by [32], presents a distinctive characteristic that sets it apart from other deep networks such as VGGNet [31] and AlexNet [47]. Namely, Inception avoids the use of large convolutions, which are computationally expensive despite their efficacy in modeling the interactions between distant activation points. As illustrated in Fig. 8, the Inception-V3 architecture boasts a unique structure that enables the network to achieve state-of-the-art performance in various computer vision tasks.

Inception-V3 has a unique architecture that efficiently models interactions between distant activation points without using large convolutions. Reviews indicate that Inception-V3 performs well in various computer vision tasks. Its middle layers tend to capture complex features and semantic concepts at a lower computational cost.

2.5.4 Xception model

The Xception [43] model, which is an extension of the Inception architecture, was introduced by Google. It has 71 layers and is a convolutional neural network architecture that uses depth-wise separable convolutions. The modified depth-wise separable convolution in the Xception architecture has been found to improve performance compared to Inception-V3 on both the ImageNet ILSVRC and JFT datasets. The architecture of Xception is depicted in Fig. 9.

Xception is an extension of the Inception architecture that utilizes depth-wise separable convolutions. The middle layers incorporate a modified separable convolution to
improve performance compared to Inception-V3. They tend to capture a second, larger set of features more efficiently.

2.5.5 DenseNet-121 model

DenseNet-121 [44], short for Densely Connected Convolutional Networks-121, is a CNN architecture designed for image classification tasks. It is part of the DenseNet family of models, which are known for their dense connections between layers, making them highly efficient and accurate for various computer vision tasks. The architecture of DenseNet-121 is shown in Fig. 10.

DenseNet-121 is known for its dense connections within a CNN architecture. Dense connections mean that each element in a layer is connected to all elements in the preceding layers. The middle layers facilitate better feature reuse and faster information flow. This model is recognized for its efficiency and accuracy.

2.5.6 MobileNet model

MobileNet [45] is a family of neural network architectures designed for efficient on-device vision applications, particularly on mobile and embedded devices. These models are known for their compact size and low computational requirements while maintaining reasonable accuracy in tasks like image classification and object detection. Figure 11 illustrates the network architecture of the MobileNet model.

MobileNet is a family of network architectures designed for on-device applications, particularly on mobile and embedded devices. These models maintain reasonable accuracy in tasks like image classification and object detection while having low computational requirements. The middle layers tend to represent important features efficiently in these lightweight models.

2.5.7 NASNet-mobile model

NASNet-Mobile [46], short for Neural Architecture Search Network-Mobile (Fig. 12), is a CNN architecture designed for efficient on-device vision applications, particularly on mobile and embedded devices. NASNet-Mobile is part of the Neural Architecture Search (NAS) family of models, which automates the process of architecture design by using reinforcement learning. It is known for its high performance and efficiency.

NASNet-Mobile is part of the Neural Architecture Search (NAS) family of models designed for on-device applications. It is known for its high performance and efficiency. The middle layers are where this model conducts an automated learning process to design features, allowing for adaptability to different tasks.
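All of the backbones described in this subsection are available through keras.applications, so the same transfer-learning head can be attached to each of them in turn. The sketch below is one way to instantiate them with frozen ImageNet weights; the constructor defaults and the 224 × 224 input shape are assumptions rather than reported settings.

```python
from tensorflow import keras

# Pre-trained backbones compared in this study, without their classification tops
BACKBONES = {
    "VGG-16": keras.applications.VGG16,
    "VGG-19": keras.applications.VGG19,
    "Inception-V3": keras.applications.InceptionV3,
    "Xception": keras.applications.Xception,
    "DenseNet-121": keras.applications.DenseNet121,
    "MobileNet": keras.applications.MobileNet,
    "NASNet-Mobile": keras.applications.NASNetMobile,
}

def build_backbone(name, input_shape=(224, 224, 3)):
    """Return a frozen, ImageNet-pretrained feature extractor for the given backbone."""
    base = BACKBONES[name](include_top=False, weights="imagenet", input_shape=input_shape)
    base.trainable = False
    return base
```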
2.6 Ensemble learning algorithm

Ensemble learning algorithms are among the most successful approaches in prediction-based analytical studies. These algorithms consist of a set of models brought together to solve a given problem. In a general sense, ensemble learning methods are learning methods that offer higher accuracy and performance by combining the predictions of more than one DL model rather than relying on a single deep learning method. It is possible to acquire higher-performance predictions by performing the training with more than one DL
method. The approach is based on producing a joint prediction by combining the predictions acquired by the classifiers, rather than combining the classifiers themselves. In this method, the results of classifiers with different accuracy rates are combined with different rules (voting, averaging, etc.). Thus, it becomes possible to obtain better results than from a single classifier. Majority voting, simple averaging, weighted averaging, and stacking have been applied to establish the ensemble models in this study.

The predictions acquired from the individual base models are treated as votes in majority voting. The prediction with the maximum number of votes is accepted as the final prediction (Fig. 13). In simple averaging, the averages of the base model predictions are used to reach the final prediction.

Weighted averaging is an extension of simple averaging in which different weights are assigned according to the base model predictions and classification performances. The weights are multiplied by each prediction and the average is then computed as (w1 × pred1 + w2 × pred2 + w3 × pred3)/3. The maximum weight can be assigned to the individual model showing the best performance. The sum of the w(i) has to be 1.0.
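The three combination rules above reduce to a few lines of array arithmetic once every base model's softmax output is available. The sketch below assumes probs is an array of shape (n_models, n_samples, n_classes) holding those outputs; the helper names are hypothetical.

```python
import numpy as np

def majority_vote(probs):
    """Each model votes with its argmax class; the most frequent class wins."""
    votes = probs.argmax(axis=-1)                      # (n_models, n_samples)
    n_classes = probs.shape[-1]
    counts = np.apply_along_axis(
        lambda v: np.bincount(v, minlength=n_classes), 0, votes)
    return counts.argmax(axis=0)                       # (n_samples,)

def simple_average(probs):
    """Unweighted mean of the class probabilities."""
    return probs.mean(axis=0).argmax(axis=-1)

def weighted_average(probs, weights):
    """Weighted sum of class probabilities; the weights sum to 1.0 and a zero
    weight effectively drops a model. Dividing by the number of models, as in
    the expression above, rescales the scores but does not change the argmax."""
    w = np.asarray(weights).reshape(-1, 1, 1)
    return (w * probs).sum(axis=0).argmax(axis=-1)
```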
Model stacking is a way to improve model predictions by combining the outputs of more than one model and processing them with another machine learning model called a meta-learner [48]. The meta-learner tries to minimize the weaknesses of the base models and exploit their robust aspects. Generally, the result is a robust model that generalizes well to unseen data. The stacking workflow is shown in Fig. 14.

As seen in Fig. 14, separate samples are not drawn for training each classifier in the classifier training process. In this process, each classifier works independently, and this allows the classifiers to work with different hypotheses and algorithms. Like the other ensemble techniques, stacking aims to improve the accuracy of a model by using the predictions of models that are not well grounded and by using these predictions as input for building a better model.
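A hedged sketch of this stacking workflow is given below, with a logistic-regression meta-learner; the choice of meta-learner and the use of the base models' class probabilities as its input features are assumptions, and probs_val, y_val, and probs_test are hypothetical arrays of base-model outputs on held-out and test data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_features(probs):
    """(n_models, n_samples, n_classes) -> (n_samples, n_models * n_classes):
    each sample is described by the concatenated class probabilities of all base models."""
    n_models, n_samples, n_classes = probs.shape
    return probs.transpose(1, 0, 2).reshape(n_samples, n_models * n_classes)

def fit_and_predict_stack(probs_val, y_val, probs_test):
    """Fit the meta-learner on held-out base-model predictions, then predict on the test set."""
    meta = LogisticRegression(max_iter=1000)   # meta-learner choice is an assumption
    meta.fit(stack_features(probs_val), y_val)
    return meta.predict(stack_features(probs_test))
```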
3 Model visualisation

3.1 Class-selective relevance map

A visualization technique based on the CRM algorithm (Eq. 1) has been used for the individual models and the ensembles to localize landslide regions [49, 50]. The CRM algorithm computes the significance of the activations in the deepest convolutional layer's feature maps of a CNN model to emphasize the most distinctive ROI in the input image. A prediction score S_c is computed for each output node c in the output layer. Another prediction score S_c(l, m) is computed after removing the spatial element (l, m) from the deepest convolutional layer. The incremental difference between S_c and S_c(l, m), accumulated over all the nodes in the output layer of the CNN model, is expressed as a sum of squared errors:

R(l, m) = \sum_{c=1}^{N} \left( S_c - S_c(l, m) \right)^2    (1)

R(l, m) represents the CRM score calculated for a specific location (l, m). The CRM score measures the significance of the activation at this location in the deepest convolutional layer's feature maps. c is an index over the output nodes of the CNN model; in other words, c iterates from 1 to N, where N represents the total number of nodes in the output layer. This score reflects the characteristics of the activation at this particular location in the deepest convolutional layer.
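Equation (1) can be read as an occlusion-style loop over the deepest convolutional feature maps. The sketch below is a direct, unoptimized rendering of that reading; features and head are hypothetical handles to the two halves of a trained model (up to, and after, the deepest convolutional layer), and zeroing the activations at (l, m) stands in for removing that spatial element.

```python
import numpy as np
import tensorflow as tf

def class_selective_relevance_map(features, head, image):
    """Eq. (1): R(l, m) = sum_c (S_c - S_c(l, m))^2, where S_c(l, m) is the class
    score obtained after removing spatial element (l, m) from the deepest
    convolutional feature maps."""
    fmap = features(image[None, ...])          # (1, H, W, C) deepest conv activations
    scores = head(fmap).numpy()[0]             # S_c for every output node c
    _, H, W, _ = fmap.shape
    crm = np.zeros((H, W), dtype=np.float32)
    for l in range(H):
        for m in range(W):
            ablated = fmap.numpy().copy()
            ablated[0, l, m, :] = 0.0          # drop the (l, m) spatial element
            scores_lm = head(tf.constant(ablated)).numpy()[0]
            crm[l, m] = np.sum((scores - scores_lm) ** 2)
    return crm   # upsampled and thresholded before being overlaid on the input image
```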
[Figure: combination of base model predictions M1, M2, M3 (average of class 1 = 63.0%, average of class 0 = 36.6%) leading to a final prediction of 1]

[Fig. 14 Model stacking workflow: Data is passed to Model 1 ... Model N, whose outputs feed a Meta-Learner that produces the final Output]
[Figure: ensemble CRM construction, in which the CRMs from Model 1, Model 2, and Model 3 (threshold 0.1) are combined by averaging into an ensemble CRM]
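The ensemble CRM illustrated above amounts to averaging the individual models' relevance maps and suppressing weak activations. A minimal sketch is shown below, assuming the CRMs have already been resized to a common grid; whether the threshold is applied to each map before averaging or to the averaged map is not fully specified here, and the per-map normalization is an assumption.

```python
import numpy as np

def ensemble_crm(crms, threshold=0.1):
    """Average several models' CRMs (all on the same H x W grid) and zero out
    locations whose averaged relevance falls below the threshold."""
    stacked = np.stack([c / (c.max() + 1e-8) for c in crms])  # per-map normalization (assumption)
    averaged = stacked.mean(axis=0)
    return np.where(averaged >= threshold, averaged, 0.0)
```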
Accuracy, Recall, F measure, and Matthews correlation coefficient metric values.

Ensemble formation from the predictions of the best seven CNN models for the classification of landslide categories has been realized with the use of the majority voting, simple averaging, weighted averaging, and stacking methods. Table 3 shows the performance metrics of the ensemble model groups acquired with the different ensemble formation strategies. Based on the results presented in Table 3, it can be seen that weighted averaging shows better performance than the other ensemble formation strategies. In the weighted averaging strategy, more weight is attributed to models with more accurate performance in order to acquire higher prediction rates. Thus, in light of the results from the analyses, it can be said that the NASNet-mobile model shows higher performance compared with the other models. As VGG-16 is the model with the lowest performance, the weight for this model is set to zero. Consequently, ensemble formation has been realized with the attribution of the weights [0.05, 0.10, 0.19, 0.10, 0.19, 0.15, 0.25], respectively, for the Custom Model, VGG-19, Inception-V3, Xception, DenseNet-121, MobileNet, and NASNet-mobile models. The feature extraction layers contributing to the acquisition of high performance in predicting the landslide and non-landslide image classes are listed in Table 4. Visualization analyses have been performed on the features extracted from these layers.

Table 3 Performance metrics achieved with different model ensemble strategies
Method               Accuracy   Precision   Recall   F measure
Majority voting      0.9221     1.0000      0.8442   0.9155
Averaging            0.9221     1.0000      0.8442   0.9155
Weighted averaging   0.9513     1.0000      0.9026   0.9488
Stacking             0.9253     1.0000      0.8506   0.9193
(Bold values show superior performance)

Table 4 Feature extraction layers of the CNN models showing superior performance with the landslide test set
Model           Feature extraction layer
Custom CNN      Conv3
VGG-16          Block5-conv3
VGG-19          Block4-pool
Inception-V3    Mixed3
Xception        Add-7
DenseNet-121    Pool3-conv
MobileNet       Conv-pw-6-relu
NASNet-mobile   Activation-129

The localization performance of the CRMs obtained from each of the seven CNN models with the best performance in the detection of landslide areas has been evaluated with the use of the landslide class test set. Table 5 shows the IoU and mAP scores acquired by averaging the individual IoU and mAP values computed from the 154 landslide images in the landslide test set that have ground-truth bounding boxes. Here, mAP is computed within the range [0.1, 0.6], averaged over the ten IoU threshold values. The equation providing the mAP score, a metric designed to evaluate performance criteria such as precision, sensitivity, and F1 measure at a single point, is given in Eq. (2):

mAP = \int_0^1 P(R) \, dR    (2)

The P(R) function represents precision as a function of R (true positives). Precision is calculated by dividing the true positives by the total number of positive predictions. R represents the ratio of true positives and takes values in the range [0, 1].
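A hedged sketch of the two localization metrics follows: the IoU between a predicted and a ground-truth box, and Eq. (2) approximated by numerically integrating precision over recall. The single-box-per-image protocol implied here is a simplification, and the exact set of ten thresholds is taken from the [0.1, 0.6] range stated above.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(recalls, precisions):
    """Eq. (2): integrate precision over recall, here with the trapezoidal rule."""
    order = np.argsort(recalls)
    return np.trapz(np.asarray(precisions)[order], np.asarray(recalls)[order])

# mAP@[0.1, 0.6]: average the AP obtained at each of the ten IoU thresholds
iou_thresholds = np.linspace(0.1, 0.6, 10)
```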
Table 5 Average IoU, mAP, and threshold values for landslide class test set
Metrics Custom CNN VGG-19 Xception Inception-V3 NASNet-mobile MobileNet DenseNet-121
This value indicates how the object detection performance changes at a specific threshold value. Moreover, Table 5 also shows the threshold values at which the best IoU and mAP values are acquired for each model.

Then, the IoU and mAP scores for three ensemble CRMs, named Ensemble-3, Ensemble-5, and Ensemble-7, have been computed. These ensemble CRMs have been formed by averaging the CRMs acquired from the first three, five, and seven best-performing CNN models selected according to the IoU and mAP scores, as shown in Table 5. The models in the different ensemble CRMs are as follows: (a) Ensemble-3 (VGG-19, Xception, DenseNet-121), (b) Ensemble-5 (VGG-19, Xception, DenseNet-121, Inception-V3, NASNet-mobile), and (c) Ensemble-7 (Custom CNN, VGG-19, Xception, DenseNet-121, Inception-V3, NASNet-mobile, MobileNet). As shown in Table 6, the ensemble CRMs have provided much higher IoU and mAP scores than the individual CRMs (in Table 6, bold values show superior performance). Among the ensemble CRMs, Ensemble-5 has shown outstanding performance for IoU and moderate performance for mAP. This shows that combining more than five CNN models does not further improve localization performance and that five models have proved sufficient for this study. Figure 17 shows the precision-recall curves of the ensemble CRMs whose mAP scores have been computed. Figure 18 shows a CRM sample of the Ensemble-5 approach localizing the ROI for a landslide image taken from the landslide test set.

Table 6 The average Intersection over Union (IoU), mean Average Precision (mAP), and corresponding threshold values obtained for the ensemble models when tested with landslide class data
Metrics          Ensemble-3   Ensemble-5   Ensemble-7
IoU              0.7826       0.8334       0.6204
mAP@[0.1, 0.6]   0.3165       0.3906       0.2972
Threshold        0.4          0.3          0.4

It is seen in Fig. 18 that the ROIs in the CRMs for the two landslide classes acquired by the five most successful CNN models have emphasized different areas. The IoUs shown here are computed using bounding boxes that enclose the real landslide areas. While the IoUs for each individual model are observed to have low scores, the IoUs acquired by the CRM ensemble show more outstanding performance. Stated more clearly, the bounding boxes used for the detection of a predicted landslide area and the bounding boxes representing the real landslide agree with each other and have better IoU scores than those of the individual CRMs. Consequently, it has been understood that the ensemble approach can be used not only for improving classification performance but also for improving overall object detection performance.

5 Discussion

As the pre-trained models have already learned classification skills from large-scale datasets with diverse data distributions, it is not surprising to observe better performance with the addition of further landslide images to this dataset. For this reason, the Custom CNN model has shown lower performance than the other models, excluding the VGG-16 model. The functionality and better performance of the ensemble models, compared with the individual models, have been demonstrated in this study, which focuses on the classification and localization of landslide images with the use of pre-trained models. The addition of non-landslide images to the training dataset in the landslide localization phase contributes to the improvement in performance. Compared with the other models, the VGG-19, Xception, DenseNet-121, Inception-V3, and NASNet-mobile models have shown outstanding performance.

The weighted-averaging ensemble formation strategy performed relatively better than the other ensemble formation strategies in terms of all performance criteria. The weighted-averaging strategy achieved much better performance than the other ensemble generation strategies by giving higher weight to the NASNet-mobile, DenseNet-121, and Inception-V3 models. In summary, the use of a weighted sum with high weights for these three prediction models is a justified approach due to their consistent and strong individual performances, their different architectures, and the desire to maintain an unbiased and robust ensemble for landslide detection and localization.
Fig. 17 Precision-recall curves concerning different IoU thresholds for (a) Ensemble-3, (b) Ensemble-5, and (c) Ensemble-7 models
[Fig. 18 CRM localization examples at threshold 0.4, compared with the real landslide bounding box, panels (a) and (b)]
Funding Open access funding provided by the Scientific and Technological Research Council of Türkiye (TÜBİTAK). No funding to declare.

Data availability Data is available at http://study.rsgis.whu.edu.cn/pages/download/.

Code availability https://github.com/sivaramakrishnan-rajaraman/Detection-and-visualization-of-abnormality-in-chest-radiographs-using-modality-specific-CNNs.

Declarations

Conflicts of interest The authors declare that they have no conflicts of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Ma Z, Mei G, Piccialli F (2021) Machine learning for landslides prevention: a survey. Neural Comput Appl 33:10881–10907. https://doi.org/10.1007/s00521-020-05529-8
2. Chen Z, Zhang Y, Ouyang C et al (2018) Automated landslides detection for mountain cities using multi-temporal remote sensing imagery. Sensors 18:821. https://doi.org/10.3390/S18030821
3. Dou J, Chang KT, Chen S et al (2015) Automatic case-based reasoning approach for landslide detection: integration of object-oriented image analysis and a genetic algorithm. Remote Sens 7:4318–4342. https://doi.org/10.3390/RS70404318
4. Tehrani FS, Calvello M, Liu Z et al (2022) Machine learning and landslide studies: recent advances and applications. Nat Hazards 114(2):1197–1245. https://doi.org/10.1007/S11069-022-05423-7
5. Cheng G, Guo L, Zhao T et al (2012) Automatic landslide detection from remote-sensing imagery using a scene classification method based on BoVW and pLSA. Int J Remote Sens 34:45–59. https://doi.org/10.1080/01431161.2012.705443
6. Danneels G, Pirard E, Havenith HB (2007) Automatic landslide detection from remote sensing images using supervised classification methods. In: International geoscience and remote sensing symposium (IGARSS), pp 3014–3017
7. Mezaal MR, Pradhan B, Sameen MI et al (2017) Optimized neural architecture for automatic landslide detection from high-resolution airborne laser scanning data. Appl Sci 7:730. https://doi.org/10.3390/APP7070730
8. Wang H, Zhang L, Yin K et al (2021) Landslide identification using machine learning. Geosci Front 12:351–364. https://doi.org/10.1016/j.gsf.2020.02.012
9. Ding A, Zhang Q, Zhou X, Dai B (2017) Automatic recognition of landslide based on CNN and texture change detection. In: Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC 2016), pp 444–448. https://doi.org/10.1109/YAC.2016.7804935
10. Ghorbanzadeh O, Blaschke T, Gholamnia K et al (2019) Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sens 11:196. https://doi.org/10.3390/RS11020196
11. Yu H, Ma Y, Wang L et al (2017) A landslide intelligent detection method based on CNN and RSG_R. In: 2017 IEEE international conference on mechatronics and automation (ICMA). Institute of Electrical and Electronics Engineers Inc., pp 40–44
12. Tang X, Tu Z, Wang Y et al (2022) Automatic detection of coseismic landslides using a new transformer method. Remote Sens (Basel) 14:2884. https://doi.org/10.3390/rs14122884
13. Liu Y, Zhang W, Chen X et al (2021) Landslide detection of high-resolution satellite images using asymmetric dual-channel network. In: 2021 IEEE international geoscience and remote sensing symposium (IGARSS). Institute of Electrical and Electronics Engineers (IEEE), pp 4091–4094
14. Tanatipuknon A, Aimmanee P, Watanabe Y et al (2021) Study on combining two faster R-CNN models for landslide detection with a classification decision tree to improve the detection performance. J Disaster Res 16:588–595. https://doi.org/10.20965/JDR.2021.P0588
15. Liu D, Li J, Fan F (2021) Classification of landslides on the southeastern Tibet Plateau based on transfer learning and limited labelled datasets. Remote Sens Lett 12:286–295. https://doi.org/10.1080/2150704X.2021.1890263
16. Ji S, Yu D, Shen C et al (2020) Landslide detection from an open satellite imagery and digital elevation model dataset using attention boosted convolutional neural networks. Landslides 17:1337–1352. https://doi.org/10.1007/S10346-020-01353-2/TABLES/9
17. Shi W, Zhang M, Ke H et al (2021) Landslide recognition by deep convolutional neural network and change detection. IEEE Trans Geosci Remote Sens 59:4654–4672. https://doi.org/10.1109/TGRS.2020.3015826
18. Lopes UK, Valiati JF (2017) Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Comput Biol Med 89:135–143. https://doi.org/10.1016/J.COMPBIOMED.2017.08.001
19. Rajaraman S, Candemir S, Kim I et al (2018) Visualization and interpretation of convolutional neural network predictions in detecting pneumonia in pediatric chest radiographs. Appl Sci 8:1715. https://doi.org/10.3390/APP8101715
20. Rajaraman S, Candemir S, Xue Z et al (2018) A novel stacked generalization of models for improved TB detection in chest radiographs. In: 2018 Annual international conference of the IEEE engineering in medicine and biology society, pp 718–721. https://doi.org/10.1109/EMBC.2018.8512337
21. Catani F (2021) Landslide detection by deep learning of non-nadiral and crowdsourced optical images. Landslides 18:1025–1044. https://doi.org/10.1007/s10346-020-01513-4
22. Li H, He Y, Xu Q et al (2022) Detection and segmentation of loess landslides via satellite images: a two-phase framework. Landslides 19:673–686. https://doi.org/10.1007/s10346-021-01789-0
23. Fu R, He J, Liu G et al (2022) Fast seismic landslide detection based on improved mask R-CNN. Remote Sens (Basel) 14:3928. https://doi.org/10.3390/rs14163928
24. Yang R, Zhang F, Xia J, Wu C (2022) Landslide extraction using mask R-CNN with background-enhancement method. Remote Sens (Basel) 14:2206. https://doi.org/10.3390/rs14092206
25. Hacıefendioğlu K, Demir G, Başağa HB (2021) Landslide detection using visualization techniques for deep convolutional neural network models. Nat Hazards 109:329–350. https://doi.org/10.1007/S11069-021-04838-Y/FIGURES/12
26. Ofli F, Imran M, Qazi U et al (2023) Landslide detection in real-time social media image streams. Neural Comput Appl 35:17809–17819. https://doi.org/10.1007/s00521-023-08648-0
27. Li D, Tang X, Tu Z et al (2023) Automatic detection of forested landslides: a case study in Jiuzhaigou County, China. Remote Sens (Basel) 15:3850. https://doi.org/10.3390/rs15153850
28. Janarthanan SS, Subbian D, Subbarayan S et al (2023) SFCNet: deep learning-based lightweight separable factorized convolution network for landslide detection. J Indian Soc Remote Sens 51:1157–1170. https://doi.org/10.1007/s12524-023-01685-1
29. Saba SB, Ali M, Turab SA et al (2023) Comparison of pixel, sub-pixel and object-based image analysis techniques for co-seismic landslides detection in seismically active area in Lesser Himalaya, Pakistan. Nat Hazards 115:2383–2398. https://doi.org/10.1007/s11069-022-05642-y
30. Ma Z, Mei G (2021) Deep learning for geological hazards analysis: data, models, applications, and opportunities. Earth Sci Rev 223:103858. https://doi.org/10.1016/j.earscirev.2021.103858
31. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: 3rd International conference on learning representations (ICLR 2015), conference track proceedings, pp 1–14
32. Szegedy C, Vanhoucke V, Ioffe S et al (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE Computer Society, pp 2818–2826
33. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the 30th IEEE conference on computer vision and pattern recognition (CVPR 2017). Institute of Electrical and Electronics Engineers Inc., pp 1800–1807
34. Sandler M, Howard A, Zhu M et al (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4510–4520
35. Pham H, Guan MY, Zoph B et al (2018) Efficient neural architecture search via parameters sharing. In: Proceedings of the 35th international conference on machine learning. PMLR, pp 4095–4104
36. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science, vol 9351, pp 234–241
37. Srivastava N, Hinton G, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958
38. Rajaraman S, Kim I, Antani SK (2020) Detection and visualization of abnormality in chest radiographs using modality-specific convolutional neural network ensembles. PeerJ 2020:e8693. https://doi.org/10.7717/PEERJ.8693/FIG-11
39. Močkus J (1974) Optimization techniques. In: Marchuk GI (ed) IFIP technical conference on optimization techniques, Novosibirsk. Springer, Heidelberg, pp 400–404
40. Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13:281–305
41. Johnson JM, Khoshgoftaar TM (2019) Survey on deep learning with class imbalance. J Big Data 6:1–54. https://doi.org/10.1186/S40537-019-0192-5/TABLES/18
42. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778. https://doi.org/10.1109/CVPR.2016.90
43. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the 30th IEEE conference on computer vision and pattern recognition (CVPR 2017), pp 1800–1807. https://doi.org/10.1109/CVPR.2017.195
44. Huang G, Liu Z, van der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. arXiv:1608.06993 [cs.CV], pp 1–9
45. Howard AG, Zhu M, Chen B et al (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861, pp 1–9
46. Zoph B, Vasudevan V, Shlens J, Le QV (2017) Learning transferable architectures for scalable image recognition. arXiv:1707.07012
47. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60:84–90
48. Dietterich TG (2000) Ensemble methods in machine learning. In: International workshop on multiple classifier systems (MCS 2000). Springer, Berlin, Heidelberg, pp 1–15
49. Kim I, Rajaraman S, Antani S (2019) Visual interpretation of convolutional neural network predictions in classifying medical image modalities. Diagnostics (Basel). https://doi.org/10.3390/DIAGNOSTICS9020038
50. Mozer MC, Smolensky P (1989) Using relevance to reduce network size automatically. Conn Sci 1:3–16. https://doi.org/10.1080/09540098908915626
51. Everingham M, Eslami SMA, Van Gool L et al (2015) The Pascal visual object classes challenge: a retrospective. Int J Comput Vis 111:98–136. https://doi.org/10.1007/S11263-014-0733-5/FIGURES/27
52. Lin T-Y, Maire M, Belongie S et al (2014) Microsoft COCO: common objects in context. In: European conference on computer vision, pp 740–755

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.