Generating synthetic social graphs with Darwini
Author: | Maja Kabiljo, Dionysios Logothetis, Sergey Edunov, Avery Ching, Cheng Wang |
---|---|
Year of publication: | 2018 |
Subject: |
Theoretical computer science; Distribution (number theory); Computer science; Assortativity; 020207 software engineering; Scale (descriptive set theory); 010103 numerical & computational mathematics; 02 engineering and technology; Degree distribution; 01 natural sciences; Graph; Core (graph theory); 0202 electrical engineering, electronic engineering, information engineering; Graph algorithms; 0101 mathematics; Cluster analysis; Eigenvalues and eigenvectors; MathematicsofComputing_DISCRETEMATHEMATICS; Clustering coefficient |
Source: | ICDCS |
ISSN: | 2516-2314 |
Description: | Synthetic graph generators facilitate research in graph algorithms and graph processing systems by providing access to graphs that resemble real social networks while addressing privacy and security concerns. Nevertheless, their practical value lies in their ability to capture important metrics of real graphs, such as degree distribution and clustering properties. Graph generators must also be able to produce such graphs at the scale of real-world industry graphs, that is, hundreds of billions or trillions of edges. In this paper, we propose Darwini, a graph generator that captures a number of core characteristics of real graphs. Importantly, given a source graph, it can reproduce the degree distribution and, unlike existing approaches, the local clustering coefficient distribution. Furthermore, Darwini maintains a number of metrics, such as graph assortativity, eigenvalues, and others. Comparing Darwini with state-of-the-art generative models, we show that it can reproduce these characteristics more accurately. Finally, we provide an open source implementation of Darwini on the vertex-centric Apache Giraph™ model that can generate synthetic graphs with up to 3 trillion edges. |
Database: | OpenAIRE |
External link: |
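
Illustrative note: the description names two per-vertex statistics that a generator would need to match, the degree distribution and the local clustering coefficient distribution. The sketch below shows how these two quantities can be computed on a small undirected graph using the standard textbook definitions. It is not Darwini's code and does not generate graphs; the function names `degree_distribution` and `local_clustering` and the toy edge list are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(adj):
    """Map each degree value to the number of vertices having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def local_clustering(adj):
    """Return the local clustering coefficient of every vertex.

    For a vertex with k >= 2 neighbors, this is the number of edges that
    exist among those neighbors divided by k * (k - 1) / 2. Vertices with
    fewer than two neighbors get 0.0 by convention.
    """
    coefficients = {}
    for vertex, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            coefficients[vertex] = 0.0
            continue
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        coefficients[vertex] = 2.0 * links / (k * (k - 1))
    return coefficients


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degrees = degree_distribution(adj)
clustering = local_clustering(adj)
```

In this toy graph, vertices 0 and 1 sit in a closed triangle and have coefficient 1.0, vertex 2 has coefficient 1/3 because only one of its three neighbor pairs is connected, and the pendant vertex 3 has coefficient 0.0.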