Centrality measures allow to identify important nodes in networked systems. An open question in network theory is the empirical observation that a node’s centrality—whose computation requires knowledge of the entire network—strongly correlates with its degree (the number of its neighbors), a local observable. We address this puzzle providing an analytical derivation of the distribution of Katz centralities in random networks. Our results explain the connection between degree and centrality: For sparse networks, the distribution of centralities is a multimodal distribution where different peaks correspond to different degrees. This finding suggests that the functionality of empirical networks may be related to nodes with over- or underexpressed centrality. Our results provide a methodology for the efficient identification of such nodes.
The Katz centrality of a node in a complex network is a measure of the node’s importance as far as the flow of information across the network is concerned. For ensembles of locally tree-like undirected random graphs, this observable is a random variable. Its full probability distribution is of interest but difficult to handle analytically because of its “global” character and its definition in terms of a matrix inverse. Leveraging a fast Gaussian Belief Propagation-Cavity algorithm to solve linear systems on tree-like structures, we show that i) the Katz centrality of a single instance can be computed recursively in a very fast way, and ii) the probability that a random node in the ensemble of undirected random graphs has centrality satisfies a set of recursive distributional equations, which can be analytically characterized and efficiently solved using a population dynamics algorithm. We test our solution on ensembles of Erdős-Rényi and Scale Free networks in the locally tree-like regime, with excellent agreement. The analytical distribution of centrality for the configuration model conditioned on the degree of each node can be employed as a benchmark to identify nodes of empirical networks with over- and underexpressed centrality relative to a null baseline. We also provide an approximate formula based on a rank- projection that works well if the network is not too sparse, and we argue that an extension of our method could be efficiently extended to tackle analytical distributions of other centrality measures such as PageRank for directed networks in a transparent and user-friendly way.