SLING: A Near-Optimal Index Structure for SimRank
Author: | Tian, Boyu; Xiao, Xiaokui |
---|---|
Year of publication: | 2016 |
Subject: | |
Document type: | Working Paper |
Description: | SimRank is a similarity measure for graph nodes that has numerous applications in practice. Scalable SimRank computation has been the subject of extensive research for more than a decade, yet none of the existing solutions can efficiently derive SimRank scores on large graphs with provable accuracy guarantees. In particular, the state-of-the-art solution requires up to a few seconds to compute a SimRank score in million-node graphs, and does not offer any worst-case assurance in terms of the query error. This paper presents SLING, an efficient index structure for SimRank computation. SLING guarantees that each SimRank score returned has at most $\varepsilon$ additive error, and it answers any single-pair or single-source SimRank query in $O(1/\varepsilon)$ or $O(n/\varepsilon)$ time, respectively. These time complexities are near-optimal, and are significantly better than the asymptotic bounds of the most recent approach. Furthermore, SLING requires only $O(n/\varepsilon)$ space (which is also near-optimal in an asymptotic sense) and $O(m/\varepsilon + n\log \frac{n}{\delta}/\varepsilon^2)$ pre-computation time, where $\delta$ is the failure probability of the preprocessing algorithm. We experimentally evaluate SLING on a variety of real-world graphs with up to several million nodes. Our results demonstrate that SLING is up to $10000$ times (resp. $110$ times) faster than competing methods for single-pair (resp. single-source) SimRank queries, at the cost of higher space overheads. Comment: A short version of this paper will appear in SIGMOD 2016 |
Database: | arXiv |
External link: |
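
For readers unfamiliar with the measure, the sketch below illustrates what a single-pair SimRank score is. It is not SLING: it is the standard naive fixed-point iteration of the SimRank definition (a node is maximally similar to itself, and two distinct nodes are similar to the extent that their in-neighbors are similar), which takes time quadratic in the number of nodes per iteration and is only practical on toy inputs. The function name `simrank`, the `in_neighbors` dictionary layout, the decay factor `c = 0.6`, the iteration count, and the toy graph are all assumptions made for illustration and do not come from the record above.

```python
from itertools import product


def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration (illustrative only).

    in_neighbors maps each node to the list of its in-neighbors.
    c is the decay factor; 0.6 is an assumed, illustrative value.
    """
    nodes = list(in_neighbors)
    # Start from s(u, u) = 1 and s(u, v) = 0 for u != v.
    scores = {(u, v): 1.0 if u == v else 0.0
              for u, v in product(nodes, repeat=2)}
    for _ in range(iterations):
        updated = {}
        for u, v in product(nodes, repeat=2):
            if u == v:
                updated[(u, v)] = 1.0
                continue
            in_u, in_v = in_neighbors[u], in_neighbors[v]
            if not in_u or not in_v:
                # A node without in-neighbors has score 0 with every other node.
                updated[(u, v)] = 0.0
                continue
            # Average the previous scores over all pairs of in-neighbors.
            total = sum(scores[(a, b)] for a in in_u for b in in_v)
            updated[(u, v)] = c * total / (len(in_u) * len(in_v))
        scores = updated
    return scores


# Hypothetical toy graph with edges a -> b and a -> c.
toy_graph = {"a": [], "b": ["a"], "c": ["a"]}
scores = simrank(toy_graph)
```

In this toy graph, `b` and `c` share their only in-neighbor `a`, so `scores[("b", "c")]` equals the decay factor `c`, while any pair involving `a` (which has no in-neighbors) scores zero. A single-pair query asks for one such entry, and a single-source query asks for one row of this table.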