Hybrid feature selection-based machine learning Classification system for the prediction of injury severity in single and multiple-vehicle accidents
Authors: | Shuguang Zhang, Afaq Khattak, Caroline Mongina Matara, Arshad Hussain, Asim Farooq |
---|---|
Language: | English |
Year of publication: | 2022 |
Subject: |
Computer and Information Sciences; Asia; Epidemiology; Science; Highways; Transportation; Research and Analysis Methods; Severity of Illness Index; Civil Engineering; Machine Learning; Geographical Locations; Machine Learning Algorithms; Artificial Intelligence; Medicine and Health Sciences; Humans; Pakistan; Public and Occupational Health; Multidisciplinary; Applied Mathematics; Simulation and Modeling; Traumatic Injury Risk Factors; Accidents, Traffic; Traffic Safety; Bayes Theorem; Transportation Infrastructure; Roads; Logistic Models; ROC Curve; Area Under Curve; Medical Risk Factors; Road Traffic Collisions; Physical Sciences; People and Places; Wounds and Injuries; Engineering and Technology; Medicine; Safety; Mathematics; Algorithms; Research Article |
Source: | PLoS ONE, Vol 17, Iss 2, p e0262941 (2022) |
ISSN: | 1932-6203 |
Description: | A reliable analysis of injury severity in road traffic accidents requires a thorough understanding of the attributes that influence it. With the shift from traditional parametric statistical procedures to computer-aided methods, machine learning approaches have become important for predicting the severity of road traffic injuries. The paper presents a hybrid feature selection-based machine learning classification approach for detecting significant attributes and predicting injury severity in single and multiple-vehicle accidents. To begin, we employed a Random Forests (RF) classifier in conjunction with an intrinsic wrapper-based feature selection approach called the Boruta Algorithm (BA) to find the attributes that determine injury severity. The influential attributes were then fed into a set of four classifiers to predict injury severity: Naive Bayes (NB), K-Nearest Neighbor (K-NN), Binary Logistic Regression (BLR), and Extreme Gradient Boosting (XGBoost). According to the BA analysis, vehicle type was the most influential factor, followed by the month of the year, the driver’s age, and the alignment of the road segment. The driver’s gender, the presence of a median, and the presence of a shoulder were all found to be unimportant. According to the classifier performance measures, XGBoost outperforms the other classifiers. Using the selected attributes, the accuracy, Cohen’s Kappa, F1-Measure, and AUC-ROC values of XGBoost were 82.10%, 0.607, 0.776, and 0.880 for single-vehicle accidents and 79.52%, 0.569, 0.752, and 0.86 for multiple-vehicle accidents, respectively. |
Database: | OpenAIRE |
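The description reports four performance measures for each classifier: accuracy, Cohen's Kappa, F1-Measure, and AUC-ROC. As a reading aid, the following is a minimal sketch of how these measures can be computed for a binary outcome. It is not the authors' code. The label coding (1 for the positive class), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the study's data.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = Counter(zip(y_true, y_pred))
    return pairs[(1, 1)], pairs[(0, 1)], pairs[(1, 0)], pairs[(0, 0)]


def accuracy(y_true, y_pred):
    """Share of cases where the prediction matches the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def cohens_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the row and column marginals of the confusion matrix.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def f1_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def auc_roc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form equals the area under the
    ROC curve and is O(n_pos * n_neg), which is fine for a small example.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("kappa:   ", round(cohens_kappa(y_true, y_pred), 3))
print("F1:      ", round(f1_measure(y_true, y_pred), 3))
print("AUC-ROC: ", round(auc_roc(y_true, scores), 3))
```

Accuracy, Kappa, and F1 depend on the thresholded predictions, while AUC-ROC is computed from the raw scores, which is why a classifier can rank well on one and less well on another. The sketch does not guard against degenerate inputs such as a single-class sample, where Kappa and AUC-ROC are undefined.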