The impact of privacy protection measures on the utility of crowdsourced cycling data
Author: | Mark Livingston, Varun Raturi, David Philip McArthur, Jinhyun Hong |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Information privacy; Computer science; Geography, Planning and Development; Privacy protection; Transportation; Information loss; Crowdsourcing; Popularity; Domain (software engineering); Data mining; Spatial analysis; General Environmental Science; Data compression |
ISSN: | 0966-6923 |
Description: | The use of new forms of data in the transport research domain is rapidly gaining popularity. However, these data come with specific challenges, and one of the major concerns is maintaining the privacy of data subjects. One widely used approach to anonymising the data is to apply binning. Recently, data from activity-tracking applications like Strava have been utilised to study and analyse active travel. Due to privacy concerns, Strava has provided data in a discretised format since July 2018. In this study, we aim to analyse the impact of the binning criteria on the utility of the crowdsourced data by using Strava data from 2013 to 2016 for the city of Glasgow. We applied the Strava binning criteria to the original dataset at three different temporal aggregations (i.e., hourly, daily and monthly) and conducted different analyses to examine their impacts. First, we compared manual cycling counts with original and binned cycling counts from Strava data. Second, net errors were calculated by comparing original and binned cycling counts from Strava data. Third, we estimated spatial autocorrelation statistics based on original and binned Strava counts and investigated the extent to which research outcomes change because of the binning approach. Our results confirmed a significant amount of information loss. Worryingly, we also show that conclusions reached by previous studies could have been reversed if the new specification of the data had been used. We outline here what precautions researchers and planners should take when working with the binned data. |
Database: | OpenAIRE |
External link: |