Learning Asynchronous-Time Information Diffusion Models and its Application to Behavioral Data Analysis over Social Networks
Author: | Saito, Kazumi; Kimura, Masahiro; Ohara, Kouzou; Motoda, Hiroshi |
---|---|
Year of publication: | 2012 |
Subject: | |
Document type: | Working Paper |
Description: | An interesting and important problem in information diffusion over a large social network is identifying an appropriate model from a limited amount of diffusion information. There are two contrasting approaches to modeling information diffusion: a push-type model known as the Independent Cascade (IC) model and a pull-type model known as the Linear Threshold (LT) model. We extend these two models (called AsIC and AsLT in this paper) to incorporate asynchronous time delay and investigate 1) how they differ from or resemble each other in terms of information diffusion, 2) whether each model is learnable from the observed information diffusion data, and 3) which model better explains how a particular topic (information) diffuses/propagates. We first show that the time delay can be modeled in several ways, and derive the likelihood of the observed data being generated under each model. Using one particular time-delay model, we show that the model parameters are learnable from a limited amount of observation. We then propose a method, based on predictive accuracy, for selecting the model that better explains the observed data. Extensive evaluations were performed. Using synthetic data with network structures taken from real networks, we first show that there are considerable behavioral differences between the AsIC and AsLT models, and that the proposed methods accurately and stably learn the model parameters and identify the correct diffusion model from a limited amount of observation data. We next apply these methods to behavioral analysis of topic propagation using real blog propagation data, and show that there is a clear indication as to which topic better follows which model, although the results are rather insensitive to the model selected when discussing how far and how fast each topic propagates based on the learned parameter values. Comment: 39 pages, 55 figures |
Database: | arXiv |
External link: |
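
The push-type, asynchronous diffusion described in the abstract can be illustrated with a minimal event-driven sketch. This is not the authors' implementation. The abstract notes that the time delay can be modeled in more than one way and does not state the parameterization, so the exponentially distributed delay, the single shared activation probability `prob`, the single shared delay rate `rate`, and the function name `simulate_asic` are all assumptions made for illustration.

```python
import heapq
import random


def simulate_asic(graph, seed_node, prob, rate, rng=None):
    """Simulate one asynchronous, push-type (IC-style) cascade.

    graph: dict mapping each node to a list of its child nodes.
    prob:  assumed shared probability that an activation attempt succeeds.
    rate:  assumed shared rate of the exponential time-delay distribution.
    Returns a dict {node: activation_time}.
    """
    rng = rng or random.Random()
    activation = {}
    # Priority queue of (arrival_time, node) candidate activation events.
    events = [(0.0, seed_node)]
    while events:
        t, v = heapq.heappop(events)
        if v in activation:
            continue  # an earlier arrival already activated this node
        activation[v] = t
        for w in graph.get(v, []):
            if w in activation:
                continue
            # Push step: v gets a single chance to activate each child w.
            if rng.random() < prob:
                delay = rng.expovariate(rate)  # assumed exponential delay
                heapq.heappush(events, (t + delay, w))
    return activation


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    times = simulate_asic(toy_graph, "a", prob=0.5, rate=1.0,
                          rng=random.Random(0))
    for node, t in sorted(times.items(), key=lambda item: item[1]):
        print(f"{node}: activated at t={t:.3f}")
```

Because events are processed in time order, a node reached by several successful attempts is activated by whichever arrives first, which is what makes the cascade asynchronous rather than step-synchronized. A pull-type (LT-style) variant would instead activate a node once the accumulated weight of its active parents crosses that node's threshold.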