Popis: |
Landslide susceptibility modeling, an essential approach to mitigate natural disasters, has witnessed considerable improvement following advances in machine learning (ML) techniques. However, in most of the previous studies, the distribution of input data was assumed as being, and treated, as normal or Gaussian; this assumption is not always valid as ML is heavily dependent on the quality of the input data. Therefore, we examine the effectiveness of six feature transformations (minimax normalization (Std-X), logarithmic functions (Log-X), reciprocal function (Rec-X), power functions (Power-X), optimal features (Opt-X), and one-hot encoding (Ohe-X) over the 11conditioning factors (i.e., altitude, slope, aspect, curvature, distance to road, distance to lineament, distance to stream, terrain roughness index (TRI), normalized difference vegetation index (NDVI), land use, and vegetation density). We selected the frequent landslide-prone area in the Cameron Highlands in Malaysia as a case study to test this novel approach. These transformations were then assessed by three benchmark ML methods, namely extreme gradient boosting (XGB), logistic regression (LR), and artificial neural networks (ANN). The 10-fold cross-validation method was used for model evaluations. Our results suggest that using Ohe-X transformation over the ANN model considerably improved performance from 52.244 to 89.398 (37.154% improvement). |