## Statistical physics of network inference for interpretable machine learning

**Dates**: from Sept. 1, 2023 to Aug. 31, 2026

**Funder**: MINECO (Spain)

**Project id**: PID2022-142600NB-I00

**Total Funding**: 142,500€

**Node Funding**: 142,500€


Networks are representations of systems whose constituent parts interact with each other in non-trivial ways. They are used to model a wide range of complex systems, including physical, chemical, biological, transportation, and social systems. Within this context, network inference is the process of inferring the structure or properties of a network from observational data. Its goal is to reconstruct the underlying network, to identify key features such as the presence of communities or the existence of key nodes, or to characterize the overall network structure. Network inference is an important step in the analysis of complex systems because it allows researchers to understand the mechanisms that govern the interactions between the constituent parts of the system. Even more importantly, it allows researchers to make specific predictions about these systems. In this sense, network inference has allowed network science to become a predictive, rather than merely descriptive, science.

Besides helping with problems that are network problems per se, network inference is also opening new ways to tackle an increasingly important problem in statistical machine learning, namely, the lack of interpretability of models. Indeed, lack of interpretability is a source of concern in deep learning for several reasons: (i) difficulty in understanding the decision-making process of the model; (ii) difficulty in identifying and correcting biases and errors; (iii) lack of trust in the model; (iv) difficulty in complying with regulations on data privacy and bias that require models to be interpretable; and (v) difficulty in using the model in applications such as healthcare and finance, where interpretability is crucial for ensuring safety and fairness. Although this problem seems, in principle, to go beyond network science, a promising approach that we have recently put forward is Bayesian symbolic regression, which leverages network representations of closed-form mathematical expressions and transforms the problem of formulating interpretable models into a network inference problem.
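To make the idea of a network representation of a closed-form expression concrete, the following minimal sketch encodes the expression a·x + exp(x) as a tree, evaluates it, and lists its nodes and edges. It is an illustration only and not the project's implementation: the nested-tuple encoding, the operator table `OPS`, and the helper names `evaluate` and `to_edge_list` are all assumptions made for this example, and the Bayesian step (sampling over such trees given data) is not shown.

```python
import math
import operator

# Hypothetical operator table for this sketch: name -> (arity, function).
OPS = {
    "+": (2, operator.add),
    "*": (2, operator.mul),
    "exp": (1, math.exp),
}


def evaluate(node, variables):
    """Recursively evaluate an expression tree.

    A node is a number, a variable/parameter name (str), or a tuple
    (operator_name, child_1, ..., child_k).
    """
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return variables[node]
    op, *children = node
    arity, fn = OPS[op]
    if len(children) != arity:
        raise ValueError(f"operator {op!r} expects {arity} argument(s)")
    return fn(*(evaluate(child, variables) for child in children))


def to_edge_list(tree):
    """Return (labels, edges): node labels indexed by id, and parent->child edges."""
    labels = []
    edges = []

    def visit(node, parent_id):
        node_id = len(labels)
        labels.append(node[0] if isinstance(node, tuple) else node)
        if parent_id is not None:
            edges.append((parent_id, node_id))
        if isinstance(node, tuple):
            for child in node[1:]:
                visit(child, node_id)

    visit(tree, None)
    return labels, edges


# The expression a*x + exp(x) as a tree.
tree = ("+", ("*", "a", "x"), ("exp", "x"))

value = evaluate(tree, {"a": 2.0, "x": 0.0})
labels, edges = to_edge_list(tree)
print(value)
print(labels)
print(edges)
```

In this encoding, proposing a different model amounts to changing the tree (adding, removing, or replacing nodes), which is what allows model selection to be framed as inference over network structures.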

The general objective of this project is to develop statistical physics tools, based on network representations, network models and network inference, for statistical machine learning approaches that are interpretable as well as predictive.

### Publications

- Bayesian estimation of information-theoretic metrics for sparsely sampled distributions - Chaos, Solitons & Fractals 180, 114564 (2024).
- Hyperedge prediction and the statistical mechanisms of higher-order and lower-order interactions in complex networks - Proc. Natl. Acad. Sci. USA 120 (50), e2303887120 (2023).