Bagging Vs. Boosting in Ensemble Machine Learning? An Integrated Application to Fraud Risk Analysis in the Insurance Sector

Authors: Ruixing Ming, Osama Mohamad, Nisreen Innab, Mohamed Hanafy
Language: English
Year of publication: 2024
Subject:
Source: Applied Artificial Intelligence, Vol 38, Iss 1 (2024)
Document type: article
ISSN: 0883-9514, 1087-6545
DOI: 10.1080/08839514.2024.2355024
Description: Addressing the pressing challenge of insurance fraud, which significantly impacts financial losses and trust within the insurance industry, this study introduces an innovative automated detection system utilizing ensemble machine learning (EML) algorithms. The approach encompasses four strategic phases: 1) tackling data imbalance through diverse re-sampling methods (over-sampling, under-sampling, and hybrid); 2) optimizing feature selection (filtering, wrapping, and embedding) to enhance model accuracy; 3) employing binary classification techniques (bagging and boosting) for effective fraud identification; and 4) applying explanatory model analysis (Shapley Additive Explanations, break-down plot, and variable-importance measure) to evaluate the influence of individual features on model performance. Our comprehensive analysis reveals that while not every re-sampling technique improves model performance, all feature selection methods markedly bolster predictive accuracy. Notably, the combination of the Gradient Boosting Machine (GBM) algorithm with NCR re-sampling and GBMVI feature selection emerges as the most effective configuration, offering superior fraud detection capabilities. This study not only advances the theoretical framework for combating insurance fraud through AI but also provides a practical blueprint for insurance companies aiming to incorporate advanced AI strategies into their fraud detection arsenals, thereby mitigating financial risks and fostering trust systems.
Database: Directory of Open Access Journals