Description: |
Online social network platforms such as Twitter and Sina Weibo have been extremely popular over the past 20 years. Identifying the community structure of a social platform is essential to exploring and understanding users' interests. However, rapid technological development has generated large amounts of social network data, which creates great computational challenges for community detection in large-scale social networks. Here, we propose a novel subsampling spectral clustering algorithm that identifies community structures in large-scale social networks with limited computing resources. More precisely, spectral clustering is conducted using only the information of a small subsample of the network nodes, which greatly reduces computational time. As a result, the method can be run on large-scale datasets even with a personal computer. Specifically, we introduce two sampling techniques: simple random subsampling and degree-corrected subsampling. The methodology is applied to a dataset collected from Sina Weibo, one of the largest Twitter-type social network platforms in China. The method effectively identifies the community structure of registered users. This community structure information can help Sina Weibo target advertisements to relevant users and increase user activity. |