Using a Bayesian approach to reconstruct graph statistics after edge sampling

Authors: Naomi A. Arnold, Raúl J. Mondragón, Richard G. Clegg
Language: English
Year of publication: 2023
Subject:
Source: Applied Network Science, Vol 8, Iss 1, Pp 1-18 (2023)
Document type: article
ISSN: 2364-8228
DOI: 10.1007/s41109-023-00574-3
Description: Abstract: Often, because a network is prohibitively large or because data-collection APIs impose limits, it is not possible to work with a complete network dataset, and sampling is required. One type of sampling that is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for recovering two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior that does not require assumptions about the original network. Our approach is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
Database: Directory of Open Access Journals
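
Note: the setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each edge is kept independently with probability p, so a node of true degree k is observed with a Binomial(k, p) degree and each triangle survives with probability p^3. The flat prior on 0..k_max and the plain scale-up triangle estimate are stand-ins chosen for illustration; the paper's own priors and estimators may differ. All function names are hypothetical.

```python
import random
from math import comb


def sample_edges(edges, p, seed=0):
    """Uniform edge sampling: keep each edge independently with probability p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]


def degree_posterior(k_obs, p, k_max):
    """Posterior over the true degree k in 0..k_max given observed degree k_obs.

    Likelihood is Binomial(k, p); the prior is flat on 0..k_max
    (an illustrative assumption, not a prior taken from the paper).
    """
    if k_obs > k_max:
        raise ValueError("k_max must be at least k_obs")
    weights = [
        comb(k, k_obs) * p**k_obs * (1 - p) ** (k - k_obs) if k >= k_obs else 0.0
        for k in range(k_max + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def count_triangles(edges):
    """Count triangles in a simple undirected graph (no duplicate edges, no self-loops)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Summing common neighbours over edges counts every triangle three times.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def scaled_triangle_estimate(sampled_edges, p):
    """Scale-up estimate of the total triangle count: observed count divided by p^3."""
    return count_triangles(sampled_edges) / p**3
```

For example, `degree_posterior(2, 0.5, 10)` returns a list of 11 probabilities over candidate true degrees 0 to 10, with zero mass on degrees below the observed value of 2.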