Popis: |
Online social networks are being increasingly used for analyzing various societal phenomena such as epidemiology, information dissemination, marketing and sentiment flow. Popular analysis techniques such as clustering and influential node analysis, require the computation of eigenvectors of the real graph's adjacency matrix. Recent de-anonymization attacks on Netflix and AOL datasets show that an open access to such graphs pose privacy threats. Among the various privacy preserving models, Differential privacy provides the strongest privacy guarantees. In this paper we propose a privacy preserving mechanism for publishing social network graph data, which satisfies differential privacy guarantees by utilizing a combination of theory of random matrix and that of differential privacy. The key idea is to project each row of an adjacency matrix to a low dimensional space using the random projection approach and then perturb the projected matrix with random noise. We show that as compared to existing approaches for differential private approximation of eigenvectors, our approach is computationally efficient, preserves the utility and satisfies differential privacy. We evaluate our approach on social network graphs of Facebook, Live Journal and Pokec. The results show that even for high values of noise variance sigma=1 the clustering quality given by normalized mutual information gain is as low as 0.74. For influential node discovery, the propose approach is able to correctly recover 80 of the most influential nodes. We also compare our results with an approach presented in [43], which directly perturbs the eigenvector of the original data by a Laplacian noise. The results show that this approach requires a large random perturbation in order to preserve the differential privacy, which leads to a poor estimation of eigenvectors for large social networks. |