A New Spatial Count Data Model with Bayesian Additive Regression Trees for Accident Hot Spot Identification
Author: | Prasad Buddhavarapu, Rico Krueger, Prateek Bansal |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: |
site ranking; FOS: Computer and information sciences; Safety Management; negative binomial regression; Computer science; statistical-analysis; Bayesian probability; unobserved heterogeneity; Negative binomial distribution; Inference; Human Factors and Ergonomics; random-parameters; spatial count data modelling; computer.software_genre; polya-gamma data augmentation; Statistics - Applications; bayesian additive regression trees; Goodness of fit; 0502 economics and business; Humans; 0501 psychology and cognitive sciences; support vector machine; Applications (stat.AP); negative binomial model; empirical bayes; Built Environment; Safety, Risk, Reliability and Quality; 050107 human factors; 050210 logistics & transportation; Spatial Analysis; Models, Statistical; 05 social sciences; Public Health, Environmental and Occupational Health; Probabilistic logic; Accidents, Traffic; crash-frequency; neural-network; Bayes Theorem; prediction; transportation safety; Support vector machine; accident analysis; Ranking; Data mining; Safety; computer; Count data |
Description: | The identification of accident hot spots is a central task of road safety management. Bayesian count data models have emerged as the workhorse method for producing probabilistic rankings of hazardous sites in road networks. Typically, these methods assume simple linear link function specifications, which, however, limit the predictive power of a model. Furthermore, extensive specification searches are precluded by complex model structures arising from the need to account for unobserved heterogeneity and spatial correlations. Modern machine learning (ML) methods offer ways to automate the specification of the link function. However, these methods do not capture estimation uncertainty, and it is also difficult to incorporate spatial correlations. In light of these gaps in the literature, this paper proposes a new spatial negative binomial model, which uses Bayesian additive regression trees to endogenously select the specification of the link function. Posterior inference in the proposed model is made feasible with the help of the Polya-Gamma data augmentation technique. We test the performance of this new model on a crash count data set from a metropolitan highway network. The empirical results show that the proposed model performs at least as well as a baseline spatial count data model with random parameters in terms of goodness of fit and site ranking ability. |
Database: | OpenAIRE |
External link: |