A generative model for reciprocity and community detection in networks

Abstract

We present a probabilistic generative model and an efficient algorithm for modeling reciprocity in directed networks. Unlike other approaches to this problem, such as exponential random graphs, the model assigns latent variables as community memberships to nodes and a reciprocity parameter to the whole network, rather than fitting order statistics. It formalizes the assumption that a directed interaction is more likely to occur if the individual has already observed an interaction directed towards her. This provides a natural framework for relaxing the assumption, common in network generative models, of conditional independence between edges, and it supports inference tasks such as predicting the existence of an edge given the observation of an edge in the reverse direction. Inference is performed with an expectation-maximization algorithm that exploits the sparsity of the network, which makes the implementation efficient and scalable. We illustrate these properties on synthetic and real data, including social networks, academic citations and the Erasmus student exchange program. Our method outperforms others both in predicting edges and in generating networks that reflect the reciprocity values observed in real data, while at the same time inferring an underlying community structure. An open-source implementation is available online.
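
To make the idea in the abstract concrete, the following is a minimal sketch of how a reciprocity parameter can couple an edge to its reverse edge on top of a community structure. It is not the authors' released implementation. The Poisson form, the sequential sampling order, and all names (`u`, `v`, `w`, `eta`, `sample_reciprocal_network`, `reverse_edge_probability`) are assumptions chosen for illustration.

```python
import numpy as np


def sample_reciprocal_network(u, v, w, eta, seed=0):
    """Sample a directed, weighted adjacency matrix with reciprocity.

    Assumed illustrative form (not taken from this page):
      M[i, j] = sum_{k, q} u[i, k] * w[k, q] * v[j, q]   (community-driven rate)
      A[i, j] ~ Poisson(M[i, j])                         for i < j
      A[j, i] ~ Poisson(M[j, i] + eta * A[i, j])         (reverse edge sees A[i, j])
    """
    rng = np.random.default_rng(seed)
    n = u.shape[0]
    M = u @ w @ v.T
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            A[i, j] = rng.poisson(M[i, j])
            A[j, i] = rng.poisson(M[j, i] + eta * A[i, j])
    return A


def reciprocity(A):
    """Fraction of directed edges whose reverse edge also exists."""
    B = A > 0
    n_edges = B.sum()
    return float((B & B.T).sum() / n_edges) if n_edges else 0.0


def reverse_edge_probability(m_ij, eta, a_ji):
    """P(A_ij > 0 | A_ji = a_ji) under the assumed Poisson form."""
    return 1.0 - np.exp(-(m_ij + eta * a_ji))


# Two assortative communities of 50 nodes each (hypothetical parameters).
n, k = 100, 2
u = np.zeros((n, k))
u[:50, 0] = 1.0
u[50:, 1] = 1.0
v = u.copy()
w = np.array([[0.08, 0.01], [0.01, 0.08]])

A_indep = sample_reciprocal_network(u, v, w, eta=0.0)  # independent edges
A_recip = sample_reciprocal_network(u, v, w, eta=0.8)  # reciprocated edges
print(reciprocity(A_indep), reciprocity(A_recip))
```

With `eta = 0` the two directions of a pair are sampled independently given the communities, so any reciprocity arises by chance. A positive `eta` raises the rate of an edge whenever the reverse edge is present, which is the dependence the abstract describes, and `reverse_edge_probability` shows how the same quantity would enter a prediction for an edge given its reverse.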

Publication
Submitted
Hadiseh Safdari
Postdoctoral researcher

My current research revolves around inference and modeling in networks. More precisely, we aim to relax the independence assumptions in generative models by deploying hidden variables and by establishing analytical approximations to make the inference problem tractable.

Martina Contisciani
PhD student

My research focuses on the analysis of network data using statistical tools. My background is in Theoretical and Applied Statistics, and I am interested in discovering new techniques, approaches and perspectives for the analysis of data. I have been working on a project on modeling covariate information in community detection algorithms, and I am involved in investigating the conditional independence assumption underlying statistical inference on network data.

Caterina De Bacco
CyberValley Research Group Leader

My research focuses on understanding, optimizing and predicting relations between the microscopic and macroscopic properties of complex large-scale interacting systems.
