Sampling on networks: estimating spectral centrality measures and their impact in evaluating other relevant network measures


We perform an extensive analysis of how sampling impacts the estimate of several relevant network measures. In particular, we focus on how a sampling strategy optimized to recover a particular spectral centrality measure impacts other topological quantities. Our goal is twofold. On the one hand, we extend the analysis of the behavior of TCEC [Ruggeri2019], a theoretically grounded sampling method for eigenvector centrality estimation. On the other hand, we demonstrate more broadly how sampling can impact the estimation of relevant network properties, such as centrality measures other than the one being optimized, community structure, and node attribute distribution. Finally, we adapt the theoretical framework behind TCEC to the case of PageRank centrality and propose a sampling algorithm aimed at optimizing its estimation. We show that, while the theoretical derivation can be suitably adapted to cover this case, the resulting algorithm suffers from a high computational complexity that requires further approximations compared to the eigenvector centrality case.

Applied Network Science 5:81 (2020)
Nicolò Ruggeri
PhD student

My research interests include, but are not limited to, Probabilistic Learning and Network Science, as well as connected fields. In particular, I aim to understand how current probabilistic models can be improved, at both the representation and training levels. I am also fascinated by how different ideas and concepts from within and outside ML combine into interesting and novel developments. Therefore, I strive to keep a broad view of theoretical and practical insights originating from different fields.

Caterina De Bacco
CyberValley Research Group Leader

My research focuses on understanding, optimizing and predicting relations between the microscopic and macroscopic properties of complex large-scale interacting systems.