Reconstructing degree distribution and triangle counts from edge-sampled graphs

Published in Complex Networks 2022, 2022

Recommended citation: Arnold, Naomi A., Raul J. Mondragón, and Richard G. Clegg. "Reconstructing degree distribution and triangle counts from edge-sampled graphs." Proceedings of the Complex Networks Conference (2022) https://arxiv.org/abs/2207.11793

Abstract

Often, due to prohibitively large size or to limits to data collecting APIs, it is not possible to work with a complete network dataset and sampling is required. A type of sampling which is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior which does not require assumptions about the original network. Our approach is tested on two synthetic and two real datasets with diverse degree and triangle count distributions.