Performance and sensitivities of home detection from mobile phone data

Authors: Clement M. Lee, Maarten Vanhoof, Zbigniew Smoreda
Year of publication: 2018
Subject:
DOI: 10.48550/arxiv.1809.09911
Description: Large-scale location-based traces, such as mobile phone data, have been identified as a promising data source to complement or even enrich official statistics. In many cases, a prerequisite step for deploying such massively gathered data is detecting the home location of individual users. The problem is that little research exists on the validation (comparison with ground truth datasets) or the uncertainty estimation of home detection methods, either at the individual user level or at the nation-wide level. In this paper, we present an extensive empirical analysis of home detection methods applied to a nation-wide mobile phone dataset from France. We analyze the validity of 9 different Home Detection Algorithms (HDAs) and assess different sources of uncertainty. Based on 225 different set-ups for the home detection of around 18 million users, we discuss different measures for validation and investigate sensitivity to user choices such as HDA parameter choice and observation period restriction. Our findings show that nation-wide performance of home detection is moderate at best, with correlations to ground truth peaking at only 0.60. Additionally, we show that the time and duration of observation have a clear effect on performance, and that the effect of HDA criteria and parameter choice is rather small compared to other uncertainties. Our findings and discussion offer welcome insights to practitioners who want to apply home detection to similar datasets, or who need an assessment of the challenges and uncertainties related to mobilizing mobile phone data for official statistics.
Comment: 18 pages, 8 figures, 1 table. Winner of the student paper award of the BigSurv18 conference
Database: OpenAIRE