Multiclass stand-alone and ensemble machine learning algorithms utilised to classify soils based on their physico-chemical characteristics

Authors: Eyo Eyo, Samuel Abbey
Language: English
Year of publication: 2022
Subject:
Source: Journal of Rock Mechanics and Geotechnical Engineering, Vol 14, Iss 2, Pp 603-615 (2022)
Document type: article
ISSN: 1674-7755
DOI: 10.1016/j.jrmge.2021.08.011
Description: This study presents an approach to classifying soils using machine learning. Multiclass versions of stand-alone machine learning algorithms (i.e. logistic regression (LR) and artificial neural network (ANN)), decision tree ensembles (i.e. decision forest (DF) and decision jungle (DJ)), and meta-ensemble models (i.e. stacking ensemble (SE) and voting ensemble (VE)) were used to classify soils based on their intrinsic physico-chemical properties. The multiclass prediction was carried out across multiple cross-validation (CV) methods, i.e. train validation split (TVS), k-fold cross-validation (KFCV), and Monte Carlo cross-validation (MCCV). Results indicated that the soils' clay fraction (CF) had the most influence on the multiclass prediction of natural soils' plasticity, while specific surface and carbonate content (CC) had the least, given the nature of the dataset used in this study. Stand-alone machine learning models (LR and ANN) gave less accurate predictions (accuracy of 0.45, average precision of 0.5, and average recall of 0.44) than tree-based models (accuracy of 0.68, average precision of 0.71, and recall rate of 0.68), while the meta-ensembles (SE and VE) outperformed all the other models used for multiclass classification (accuracy of 0.75, average precision of 0.74, and average recall rate of 0.72). Sensitivity analysis of the meta-ensembles demonstrated their capacity to discriminate between soil classes across the CV methods considered. Training and validation using the MCCV and KFCV methods enabled better prediction while also ensuring that the models did not overfit the dataset. This was further confirmed by the continuous rise of the cumulative lift curve (LC) of the best performing models when using the MCCV technique. Overall, this study demonstrated that a soil's physico-chemical properties do have a direct influence on plastic behaviour and can therefore be relied upon to classify soils.
Database: Directory of Open Access Journals
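
The description names k-fold and Monte Carlo cross-validation, a voting ensemble, and accuracy with average precision and recall, but the record does not include the authors' implementation. The following is a minimal, standard-library-only Python sketch of those general ideas, not the method used in the article. All function names are hypothetical. It assumes hard (majority) voting and macro-averaged precision and recall; the article may have used soft voting or a different averaging scheme.

```python
import random
from collections import Counter


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def monte_carlo_splits(n_samples, n_repeats, test_fraction=0.2, seed=0):
    """Yield (train, test) index lists from repeated random splits.

    Unlike k-fold, test sets from different repeats may overlap.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


def majority_vote(predictions_per_model):
    """Hard voting (assumed): per sample, return the most frequent label.

    Ties go to the label that appears first among the models' votes.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for multiclass labels."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        # A class that is never predicted (or never present) scores 0.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return accuracy, sum(precisions) / len(labels), sum(recalls) / len(labels)
```

In a full workflow, each (train, test) pair would be used to fit the candidate classifiers on the training indices, combine their test predictions with `majority_vote`, and score the result with `macro_scores`; averaging the scores over all splits gives the cross-validated estimate.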