A New Spatial Count Data Model with Bayesian Additive Regression Trees for Accident Hot Spot Identification

Authors: Prasad Buddhavarapu, Rico Krueger, Prateek Bansal
Language: English
Year of publication: 2020
Subject:
site ranking
FOS: Computer and information sciences
Safety Management
negative binomial regression
Computer science
statistical-analysis
Bayesian probability
unobserved heterogeneity
Negative binomial distribution
Inference
Human Factors and Ergonomics
random-parameters
spatial count data modelling
polya-gamma data augmentation
Statistics - Applications
bayesian additive regression trees
Goodness of fit
0502 economics and business
Humans
0501 psychology and cognitive sciences
support vector machine
Applications (stat.AP)
negative binomial model
empirical bayes
Built Environment
Safety, Risk, Reliability and Quality
050107 human factors
050210 logistics & transportation
Spatial Analysis
Models, Statistical
05 social sciences
Public Health, Environmental and Occupational Health
Probabilistic logic
Accidents, Traffic
crash-frequency
neural-network
Bayes Theorem
prediction
transportation safety
accident analysis
Ranking
Data mining
Safety
Count data
Description: The identification of accident hot spots is a central task of road safety management. Bayesian count data models have emerged as the workhorse method for producing probabilistic rankings of hazardous sites in road networks. Typically, these methods assume simple linear link function specifications, which, however, limit the predictive power of a model. Furthermore, extensive specification searches are precluded by complex model structures arising from the need to account for unobserved heterogeneity and spatial correlations. Modern machine learning (ML) methods offer ways to automate the specification of the link function. However, these methods do not capture estimation uncertainty, and it is also difficult to incorporate spatial correlations. In light of these gaps in the literature, this paper proposes a new spatial negative binomial model, which uses Bayesian additive regression trees to endogenously select the specification of the link function. Posterior inference in the proposed model is made feasible with the help of the Polya-Gamma data augmentation technique. We test the performance of this new model on a crash count data set from a metropolitan highway network. The empirical results show that the proposed model performs at least as well as a baseline spatial count data model with random parameters in terms of goodness of fit and site ranking ability.
Database: OpenAIRE
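
The probabilistic site ranking mentioned in the description can be illustrated with a minimal sketch. It is not the paper's model or code: it only assumes that some Bayesian count model has already produced posterior draws of the expected crash count for each site, and it estimates, for every site, the share of draws in which that site is among the k most hazardous. The function name `top_k_probabilities`, the four hypothetical sites, and the gamma-distributed stand-in draws are all assumptions made for illustration.

```python
import random


def top_k_probabilities(draws, k):
    """Estimate P(site is among the k highest expected counts).

    draws: list of posterior draws; each draw is a list holding one
           expected crash count per site (hypothetical model output).
    """
    n_sites = len(draws[0])
    hits = [0] * n_sites
    for draw in draws:
        # Site indices ordered from highest to lowest expected count.
        ranked = sorted(range(n_sites), key=lambda s: draw[s], reverse=True)
        for s in ranked[:k]:
            hits[s] += 1
    return [h / len(draws) for h in hits]


# Stand-in for MCMC output: gamma draws centred on assumed site means.
rng = random.Random(0)
assumed_means = [2.0, 5.0, 1.0, 8.0]  # hypothetical, not from the paper
draws = [
    [rng.gammavariate(4.0, m / 4.0) for m in assumed_means]  # mean = m
    for _ in range(2000)
]
probs = top_k_probabilities(draws, k=2)
for site, p in enumerate(probs):
    print(f"site {site}: P(top 2) = {p:.3f}")
```

Ranking sites by these probabilities, rather than by a single point estimate, carries the estimation uncertainty of the underlying model into the hot spot list.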