Machine Learning Methods to Analyze Injury Severity of Drivers from Different Age and Gender Groups

Authors: Ryan Doczy, Somayeh Mafi, Yassir AbdelRazig
Publication year: 2018
Subject:
Source: Transportation Research Record: Journal of the Transportation Research Board. 2672:171-183
ISSN: 2169-4052
0361-1981
DOI: 10.1177/0361198118794292
Description: Access to unbiased and accurate models capable of predicting driver injury severity in collision events is vital for determining which safety measures should be implemented at intersections. Inadequate models can underestimate the potential for collisions to result in driver fatalities or injuries, which can lead to an improper assessment of an intersection's safety criteria. This study investigates how injury severity differs between drivers of various age and gender groups using cost-sensitive data-mining models. Previous research efforts have used machine-learning methods to predict injury severity; however, these studies did not consider the consequences (cost) of incorrect predictions. This paper addresses this shortfall by considering the monetary cost of incorrect injury severity predictions when developing C4.5, instance-based (IB), and random forest (RF) machine-learning models. One model of each method was developed for each of four distinct cohorts of drivers (i.e., younger males, younger females, older males, and older females). Each model considered a selection of driver, vehicular, road/traffic, environmental, and crash parameters to determine whether they significantly influenced driver injury severity. Five years of two-vehicle crash data collected at signalized intersections in the metropolitan area of Miami, Florida, were used in the models. Results indicated that cost-sensitive learning classifiers were superior to regular classifiers at accurately predicting injuries and fatalities in crashes. Among the cost-sensitive models, RF outperformed the C4.5 and IB models in predicting driver injury severity for the four groups of drivers. The models displayed substantial differences in injury severity determinants across the age/gender cohorts.
Database: OpenAIRE
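
Note: The description refers to cost-sensitive classification, in which the cost of each kind of incorrect prediction is taken into account. One common way to do this is to pick the class with the lowest expected misclassification cost rather than the most probable class. The sketch below illustrates that general idea only; it is not necessarily the procedure the authors used, and the class names, cost values, and probabilities are hypothetical placeholders rather than figures from the paper.

```python
# Minimal sketch of a minimum-expected-cost decision rule.
# All class names, costs, and probabilities are hypothetical (not from the paper).

CLASSES = ["no_injury", "injury", "fatality"]

# COST[true][predicted]: assumed cost of predicting `predicted` when the
# actual outcome is `true`. Missing a severe outcome is penalized most.
COST = {
    "no_injury": {"no_injury": 0.0, "injury": 1.0, "fatality": 5.0},
    "injury": {"no_injury": 10.0, "injury": 0.0, "fatality": 1.0},
    "fatality": {"no_injury": 100.0, "injury": 20.0, "fatality": 0.0},
}


def expected_cost(probs, predicted):
    """Expected cost of predicting `predicted`, given class probabilities."""
    return sum(probs[true] * COST[true][predicted] for true in CLASSES)


def most_likely_prediction(probs):
    """Regular (cost-blind) rule: choose the most probable class."""
    return max(CLASSES, key=lambda c: probs[c])


def min_cost_prediction(probs):
    """Cost-sensitive rule: choose the class with the lowest expected cost."""
    return min(CLASSES, key=lambda c: expected_cost(probs, c))


# Hypothetical class probabilities produced by some classifier for one crash.
probs = {"no_injury": 0.80, "injury": 0.15, "fatality": 0.05}

regular = most_likely_prediction(probs)
cost_sensitive = min_cost_prediction(probs)
```

With these assumed numbers, the cost-blind rule selects "no_injury" while the cost-sensitive rule selects "injury", because the small probability of a severe outcome carries a large assumed cost if it is missed.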