Efficient Missing Counts Imputation of a Bike-Sharing System by Generative Adversarial Network

Authors: Shu Yang, Yunlong Zhang, Xiao Xiao, Xiaoqiang Kong
Publication year: 2022
Subject:
Source: IEEE Transactions on Intelligent Transportation Systems. 23:13443-13451
ISSN: 1558-0016
1524-9050
DOI: 10.1109/tits.2021.3124409
Description: Missing data are common in a bike-sharing system for various reasons, such as the failure of data collection devices. To make better use of bike-sharing data and to guide the operation and planning of the public transportation system, missing data need to be imputed. When the missing rate is as high as 50%, or when the training set used to calibrate a model is itself incomplete, many commonly used methods for handling missing data may fail. Our concerns are how to incorporate the temporal-spatial relations among counts in a bike-sharing system and how to ensure stable performance when the training set is incomplete. To address these issues, we propose a method that uses the strengths of a Generative Adversarial Network (GAN), which learns the distribution of missingness and generates data close to ground-truth values to impute the missing counts. Traffic count data were collected from Bluebikes in Boston. With limited available observations, the proposed method imputes missing traffic counts under two scenarios: not missing at random (NMAR) and missing completely at random (MCAR). The proposed method is robust to increasing missingness ratios in the dataset. In our experiment, the RMSE values used to measure imputation accuracy remain below 0.15 as the missingness ratio rises from 20% to 80%. Compared with other baseline methods, the method is robust and efficient for missing data imputation in a bike-sharing system.
Database: OpenAIRE
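
The description evaluates imputation by RMSE under MCAR and NMAR missingness at a chosen missingness ratio. The Python sketch below illustrates that evaluation setup only. It is not the authors' GAN: the data are synthetic Poisson counts rather than Bluebikes data, the imputer is a simple per-station mean baseline, the NMAR mechanism (higher counts are more likely to be dropped) is just one possible choice, and all function names are hypothetical.

```python
import numpy as np

def mcar_mask(shape, missing_ratio, rng):
    """Return a boolean mask (True = observed); entries drop independently."""
    return rng.random(shape) >= missing_ratio

def nmar_mask(counts, missing_ratio, rng):
    """Return a mask where drop probability grows with the value itself.

    This is one illustrative NMAR mechanism, assumed for this sketch.
    """
    weights = counts / counts.mean()  # average weight is 1
    drop_prob = np.clip(missing_ratio * weights, 0.0, 1.0)
    return rng.random(counts.shape) >= drop_prob

def mean_impute(counts, observed):
    """Fill missing entries with the mean of observed entries per column."""
    masked = np.where(observed, counts, np.nan)
    col_means = np.nanmean(masked, axis=0)
    return np.where(observed, counts, col_means)

def rmse_on_missing(truth, imputed, observed):
    """RMSE computed only over the entries that were hidden."""
    missing = ~observed
    return float(np.sqrt(np.mean((truth[missing] - imputed[missing]) ** 2)))

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 time steps x 10 stations, scaled to [0, 1].
counts = rng.poisson(lam=5.0, size=(200, 10)).astype(float)
counts /= counts.max()

observed_mcar = mcar_mask(counts.shape, 0.5, rng)
observed_nmar = nmar_mask(counts, 0.5, rng)

for name, observed in [("MCAR", observed_mcar), ("NMAR", observed_nmar)]:
    imputed = mean_impute(counts, observed)
    print(name, "RMSE on missing entries:", rmse_on_missing(counts, imputed, observed))
```

Scoring only the hidden entries keeps the metric from being diluted by values the imputer never had to guess. Any other imputer, including a GAN-based one, could be substituted for `mean_impute` without changing the rest of the harness.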