A Unified Framework for User Identification Across Online and Offline Data

Authors: Haishan Wu, Jingbo Zhou, Yunsheng Cheng, Longbo Huang, Tianyi Hao
Publication year: 2022
Subject:
Source: IEEE Transactions on Knowledge and Data Engineering. 34:1562-1575
ISSN: 2326-3865
1041-4347
DOI: 10.1109/tkde.2020.3000287
Description: User identification across multiple datasets has a wide range of applications, and a growing body of research has addressed the topic in recent years. However, most existing work focuses on user identification with a single input data type, e.g., (I) identifying a user across multiple social networks with online data and (II) detecting a single user across heterogeneous trajectory datasets with offline data. In contrast to previous work, this paper proposes a framework for user identification between online and offline datasets. We connect the two types of data through a mapping from IP addresses to physical locations. To solve this problem, we propose a novel framework consisting of three steps. First, we use a clustering method based on the locations of IP addresses to map IP addresses to specific physical location distributions. Second, we propose a novel pairwise index to reduce the space cost and running time of computing co-occurrence. Lastly, we apply a learning-to-rank method to combine the effects of the multiple features obtained in the first two steps. Based on this framework, we design experiments to demonstrate its efficiency (in time and space), together with the precision and recall of our approach compared with other methods.
Database: OpenAIRE
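
Illustration: the abstract's second step relies on co-occurrence between online and offline records. The following is a minimal sketch of that general idea, not the paper's pairwise index or its learning-to-rank model. It assumes each record has already been reduced to a hypothetical `(user_id, cell, time_bucket)` triple, that online IP addresses were mapped to a location cell beforehand, and that candidates are ordered by raw co-occurrence count as a stand-in for a learned ranking. All function and field names are illustrative.

```python
from collections import defaultdict
from itertools import product


def build_index(records):
    """Map each (cell, time_bucket) key to the set of user ids seen there."""
    index = defaultdict(set)
    for user, cell, bucket in records:
        index[(cell, bucket)].add(user)
    return index


def cooccurrence_counts(online_records, offline_records):
    """Count shared (cell, time_bucket) keys for every online/offline pair."""
    online_index = build_index(online_records)
    offline_index = build_index(offline_records)
    counts = defaultdict(int)
    # Only keys present in both datasets can produce a co-occurrence,
    # so pairs that never meet are never enumerated.
    for key in online_index.keys() & offline_index.keys():
        for pair in product(online_index[key], offline_index[key]):
            counts[pair] += 1
    return dict(counts)


def rank_candidates(counts, online_user):
    """Order offline candidates by count (assumed stand-in for a learned ranker)."""
    candidates = [(off, c) for (on, off), c in counts.items() if on == online_user]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: (user_id, cell, time_bucket)
online = [("acct_a", "cell_1", 9), ("acct_a", "cell_2", 20), ("acct_b", "cell_3", 9)]
offline = [
    ("dev_x", "cell_1", 9),
    ("dev_x", "cell_2", 20),
    ("dev_y", "cell_1", 9),
    ("dev_y", "cell_3", 9),
]
counts = cooccurrence_counts(online, offline)
ranking = rank_candidates(counts, "acct_a")
```

Grouping records by key before pairing is one simple way to avoid comparing every online user with every offline user; the index described in the paper may be organized quite differently.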