Abstract
In this study, we evaluate and compare the prediction capability of Bagging Ensemble Based Alternating Decision Trees (BADT), Logistic Regression (LR), and J48 Decision Trees (J48DT) for landslide susceptibility mapping in part of Uttarakhand State (India). The BADT method, proposed in the present study, is a novel hybrid machine learning approach that combines a bagging ensemble with alternating decision trees. J48DT is a relatively new machine learning technique that has been applied in only a few landslide studies, whereas LR is a popular landslide susceptibility model. For the modeling, a spatial database of 930 historical landslide events and 15 landslide affecting factors was collected and analyzed. This database was used to build and validate the three landslide models (BADT, LR, and J48DT). The predictive capability of these models was validated and compared using statistical analysis methods and the Receiver Operating Characteristic (ROC) curve. The results show that all three models performed well on the training dataset. On the validation dataset, however, the BADT model has the highest prediction capability, followed by the LR model and then the J48DT model. This indicates that BADT is a promising method that can also be used for landslide susceptibility assessment in other landslide-prone areas.
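The workflow summarized above (fit a bagged tree ensemble on a training set, score a held-out validation set, compare by area under the ROC curve) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the two factors are hypothetical, the function names are invented for this sketch, and one-level decision stumps stand in for alternating decision trees as the bagged base learner.

```python
import random


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stump(X, y):
    """One-level decision tree: the single (factor, threshold, direction)
    split with the fewest training errors. Stand-in base learner (assumption)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                errors = sum((1 if sign * (row[j] - t) > 0 else 0) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0


def fit_bagging(X, y, n_estimators=15, seed=0):
    """Bagging: train each base learner on a bootstrap resample and score
    a sample by the fraction of learners voting for the positive class."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: sum(m(row) for m in models) / len(models)


def make_synthetic(n, seed):
    """Hypothetical data: two factors in [0, 1] and a noisy binary label."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if a + 0.3 * b + rng.gauss(0, 0.15) > 0.65 else 0 for a, b in X]
    return X, y


def demo():
    """Fit on a training set, then report AUC on training and validation."""
    X_train, y_train = make_synthetic(120, seed=1)
    X_val, y_val = make_synthetic(80, seed=2)
    model = fit_bagging(X_train, y_train)
    train_auc = roc_auc(y_train, [model(r) for r in X_train])
    val_auc = roc_auc(y_val, [model(r) for r in X_val])
    return train_auc, val_auc


if __name__ == "__main__":
    train_auc, val_auc = demo()
    print(f"training AUC = {train_auc:.3f}, validation AUC = {val_auc:.3f}")
```

Reporting the validation AUC separately from the training AUC reflects the distinction drawn in the abstract: a model can fit the training data well and still rank unseen locations less reliably.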
Acknowledgements
The authors are thankful to Dr MB Dholakia, LD College of Engineering, Gujarat, India, for his encouragement. The authors also thank the Director, Bhaskarcharya Institute for Space Applications and Geo-Informatics (BISAG), Department of Science and Technology, Government of Gujarat, Gandhinagar, Gujarat, India, for providing facilities to carry out this research work.
Cite this article
Pham, B.T., Tien Bui, D. & Prakash, I. Landslide Susceptibility Assessment Using Bagging Ensemble Based Alternating Decision Trees, Logistic Regression and J48 Decision Trees Methods: A Comparative Study. Geotech Geol Eng 35, 2597–2611 (2017). https://doi.org/10.1007/s10706-017-0264-2
DOI: https://doi.org/10.1007/s10706-017-0264-2