Pedro A. M. Mediano, Fernando E. Rosas, Adam B. Barrett, Daniel Bor
When employing non-linear methods to characterise complex systems, it is important to determine to what extent they are capturing genuine non-linear phenomena that could not be assessed by simpler spectral methods. Specifically, we are concerned with the problem of quantifying spectral and phasic effects on an observed difference in a non-linear feature between two systems (or two states of the same system). Here we derive, from a sequence of null models, a decomposition of the difference in an observable into spectral, phasic, and spectrum-phase interaction components. Our approach makes no assumptions about the structure of the data and adds nuance to a wide range of time series analyses.
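The decomposition above rests on spectrum-preserving null models. As an illustrative sketch (not the authors' code; function names are my own), here is the standard phase-randomisation surrogate construction, which preserves a signal's power spectrum while destroying its phase structure — the kind of null model such a spectral/phasic decomposition can build on:

```python
import numpy as np

def phase_randomise(x, rng):
    """Surrogate with the same power spectrum as x but uniformly random phases."""
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=X.shape)
    phases[0] = 0.0                      # keep the DC component real
    if x.size % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist component real
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=x.size)

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(1024))  # toy signal (random walk)
s = phase_randomise(x, rng)

# The surrogate's power spectrum matches the original's up to numerical error,
# so any difference in a non-linear feature between x and s is attributable to phase.
assert np.allclose(np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(s)), atol=1e-6)
```

Comparing a non-linear observable between the original signal and an ensemble of such surrogates isolates the phasic contribution to that observable.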
Pedro A. M. Mediano, Fernando E. Rosas, Andrea I. Luppi, Robin L. Carhart-Harris, Daniel Bor, Anil K. Seth, Adam B. Barrett
Sep 27, 2021 · q-bio.NC
Complex systems, from the human brain to the global economy, are made of multiple elements that interact in such ways that the behaviour of the `whole' often seems to be more than what is readily explainable in terms of the `sum of the parts.' Our ability to understand and control these systems remains limited, one reason being that we still don't know how best to describe -- and quantify -- the higher-order dynamical interactions that characterise their complexity. To address this limitation, we combine principles from the theories of Information Decomposition and Integrated Information into what we call Integrated Information Decomposition, or $Φ$ID. $Φ$ID provides a comprehensive framework to reason about, evaluate, and understand the information dynamics of complex multivariate systems. $Φ$ID reveals the existence of previously unreported modes of collective information flow, providing tools to express well-known measures of information transfer and dynamical complexity as aggregates of these modes. Via computational and empirical examples, we demonstrate that $Φ$ID extends our explanatory power beyond traditional causal discovery methods -- with profound implications for the study of complex systems across disciplines.
Patricio Orio, Pedro A. M. Mediano, Fernando E. Rosas
Recent research has provided a wealth of evidence highlighting the pivotal role of high-order interdependencies in supporting the information-processing capabilities of distributed complex systems. These findings may suggest that high-order interdependencies constitute a powerful resource that is, however, challenging to harness and can be readily disrupted. In this paper we contest this perspective by demonstrating that high-order interdependencies can not only exhibit robustness to stochastic perturbations, but can in fact be enhanced by them. Using elementary cellular automata as a general testbed, our results unveil the capacity of dynamical noise to enhance the statistical regularities between agents and, intriguingly, even alter the prevailing character of their interdependencies. Furthermore, our results show that these effects are related to the high-order structure of the local rules, which affects the system's susceptibility to noise and its characteristic timescales. These results deepen our understanding of how high-order interdependencies may spontaneously emerge within distributed systems interacting with stochastic environments, thus providing an initial step towards elucidating their origin and function in complex systems like the human brain.
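The testbed described above — elementary cellular automata perturbed by dynamical noise — can be sketched in a few lines. This is a generic illustration under my own conventions (rule number, flip probability, and lattice size are arbitrary choices), not the authors' simulation code:

```python
import numpy as np

def eca_step(state, rule, flip_prob, rng):
    """One synchronous update of an elementary cellular automaton with
    dynamical noise: each output bit is flipped independently with
    probability flip_prob."""
    rule_table = np.array([(rule >> k) & 1 for k in range(8)], dtype=np.uint8)
    left, right = np.roll(state, 1), np.roll(state, -1)
    idx = 4 * left + 2 * state + right        # 3-cell neighbourhood as a 3-bit index
    new_state = rule_table[idx]
    flips = rng.random(state.size) < flip_prob
    return new_state ^ flips.astype(np.uint8)

rng = np.random.default_rng(1)
state = rng.integers(0, 2, 200, dtype=np.uint8)
for _ in range(100):
    state = eca_step(state, rule=110, flip_prob=0.01, rng=rng)
print(state.mean())  # density of ones after 100 noisy updates
```

Sweeping `flip_prob` and measuring interdependencies between cells over time is the kind of experiment the abstract's claims are based on.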
Fernando E. Rosas, Bernhard C. Geiger, Andrea I. Luppi, Anil K. Seth, Daniel Polani, Michael Gastpar, Pedro A. M. Mediano
Understanding the functional architecture of complex systems is crucial to illuminate their inner workings and enable effective methods for their prediction and control. Recent advances have introduced tools to characterise emergent macroscopic levels; however, while these approaches are successful in identifying when emergence takes place, they are limited in the extent to which they can determine how it does. Here we address this limitation by developing a computational approach to emergence, which characterises macroscopic processes in terms of their computational capabilities. Concretely, we articulate a view on emergence based on how software works, rooted in a mathematical formalism that describes how macroscopic processes can express self-contained informational, interventional, and computational properties. This framework establishes a hierarchy of nested self-contained processes that determines what computations take place at what level, which in turn delineates the functional architecture of a complex system. This approach is illustrated on paradigmatic models from the statistical physics and computational neuroscience literature, which are shown to exhibit macroscopic processes that are akin to software in human-engineered systems. Overall, this framework enables a deeper understanding of the multi-level structure of complex systems, revealing specific ways in which they can be efficiently simulated, predicted, and controlled.
Carles Balsells-Rodas, Toshiko Matsui, Pedro A. M. Mediano, Yixin Wang, Yingzhen Li
Identifiability is central to the interpretability of deep latent variable models, ensuring parameterisations are uniquely determined by the data-generating distribution. However, it remains underexplored for deep regime-switching time series. We develop a general theoretical framework for multi-lag Regime-Switching Models (RSMs), encompassing Markov Switching Models (MSMs) and Switching Dynamical Systems (SDSs). For MSMs, we formulate the model as a temporally structured finite mixture and prove identifiability of both the number of regimes and the multi-lag transitions in a nonlinear-Gaussian setting. For SDSs, we establish identifiability of the latent variables up to permutation and scaling via temporal structure, which in turn yields conditions for identifiability of regime-dependent latent causal graphs (up to regime/node permutations). Our results hold in a fully unsupervised setting through architectural and noise assumptions that are directly enforceable via neural network design. We complement the theory with a flexible variational estimator that satisfies the assumptions and validate the results on synthetic benchmarks. Across real-world datasets from neuroscience, finance, and climate, identifiability leads to more trustworthy interpretability analysis, which is crucial for scientific discovery.
Alberto Liardi, George Blackburne, Hardik Rajpal, Fernando E. Rosas, Pedro A. M. Mediano
Our understanding of complex systems rests on our ability to characterise how they perform distributed computation and integrate information. Advances in information theory have introduced several quantities to describe complex information structures, where collective patterns of coordination emerge from higher-order (i.e. beyond-pairwise) interdependencies. Unfortunately, the use of these approaches to study large complex systems is severely hindered by the poor scalability of existing techniques. Moreover, there are relatively few measures specifically designed for multivariate time series data. Here we introduce a novel measure of information about macroscopic structures, termed M-information, which quantifies the higher-order integration of information in complex dynamical systems. We show that M-information can be calculated via a convex optimisation problem, and we derive a robust and efficient algorithm that scales gracefully with system size. Our analyses show that M-information is resilient to noise, indexes critical behaviour in artificial neuronal populations, and reflects states of consciousness and task performance in real-world macaque and mouse neuroimaging data. Furthermore, M-information can be incorporated into existing information decomposition frameworks to reveal a comprehensive taxonomy of information dynamics. Taken together, these results help us unravel collective computation in large complex systems.
Andrea I. Luppi, Eckehard Olbrich, Conor Finn, Laura E. Suárez, Fernando E. Rosas, Pedro A. M. Mediano, Jürgen Jost
Understanding how different networks relate to each other is key for obtaining a greater insight into complex systems. Here, we introduce an intuitive yet powerful framework to characterise the relationship between two networks comprising the same nodes. We showcase our framework by decomposing the shortest paths between nodes as being contributed uniquely by one or the other source network, or redundantly by either, or synergistically by the two together. Our approach takes into account the networks' full topology, and it also provides insights at multiple levels of resolution: from global statistics, to individual paths of different length. We show that this approach is widely applicable, from brains to the London public transport system. In humans and across 123 other mammalian species, we demonstrate that reliance on unique contributions by long-range white matter fibers is a conserved feature of mammalian structural brain networks. Across species, we also find that efficient communication relies on significantly greater synergy between long-range and short-range fibers than expected by chance, and significantly less redundancy. Our framework may find applications to help decide how to trade off different desiderata when designing network systems, or to evaluate their relative presence in existing systems, whether biological or artificial.
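The comparison primitive behind this decomposition — shortest paths computed on each network separately versus on their union — can be illustrated concretely. This toy sketch (my own construction; it does not reproduce the authors' redundancy/synergy bookkeeping) shows a "synergy-like" gain, where combining two networks shortens a path beyond what either achieves alone:

```python
import heapq

def shortest_path_lengths(adj, source):
    """Dijkstra over a dict-of-dicts adjacency: adj[u][v] = edge weight."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Two toy networks on the same nodes: a "short-range" ring and a "long-range" shortcut.
nodes = range(6)
ring = {u: {(u + 1) % 6: 1.0, (u - 1) % 6: 1.0} for u in nodes}
shortcut = {0: {3: 1.0}, 3: {0: 1.0}}
union = {u: {**ring.get(u, {}), **shortcut.get(u, {})} for u in nodes}

d_ring = shortest_path_lengths(ring, 0)
d_both = shortest_path_lengths(union, 0)
# Combining the networks shortens the 0→3 path from 3 hops to 1.
print(d_ring[3], d_both[3])  # 3.0 1.0
```

Tabulating, for every node pair, which source network(s) realise the optimal path is the kind of accounting that yields unique, redundant, and synergistic contributions.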
Pedro A. M. Mediano, Fernando Rosas, Robin L. Carhart-Harris, Anil K. Seth, Adam B. Barrett
Most information dynamics and statistical causal analysis frameworks rely on the common intuition that causal interactions are intrinsically pairwise -- every 'cause' variable has an associated 'effect' variable, so that a 'causal arrow' can be drawn between them. However, analyses that depict interdependencies as directed graphs fail to discriminate the rich variety of modes of information flow that can coexist within a system. This, in turn, creates problems with attempts to operationalise the concepts of 'dynamical complexity' or 'integrated information.' To address this shortcoming, we combine concepts of partial information decomposition and integrated information, and obtain what we call Integrated Information Decomposition, or $Φ$ID. We show how $Φ$ID paves the way for more detailed analyses of interdependencies in multivariate time series, and sheds light on collective modes of information dynamics that have not been reported before. Additionally, $Φ$ID reveals that what is typically referred to as 'integration' is actually an aggregate of several heterogeneous phenomena. Furthermore, $Φ$ID can be used to formulate new, tailored measures of integrated information, as well as to understand and alleviate the limitations of existing measures.
Zafeirios Fountas, Noor Sajid, Pedro A. M. Mediano, Karl Friston
Active inference is a Bayesian framework for understanding biological intelligence. The underlying theory brings together perception and action under one single imperative: minimizing free energy. However, despite its theoretical utility in explaining intelligence, computational implementations have been restricted to low-dimensional and idealized situations. In this paper, we present a neural architecture for building deep active inference agents operating in complex, continuous state-spaces using multiple forms of Monte-Carlo (MC) sampling. For this, we introduce a number of techniques, novel to active inference. These include: i) selecting free-energy-optimal policies via MC tree search, ii) approximating this optimal policy distribution via a feed-forward `habitual' network, iii) predicting future parameter belief updates using MC dropouts and, finally, iv) optimizing state transition precision (a high-end form of attention). Our approach enables agents to learn environmental dynamics efficiently, while maintaining task performance, in relation to reward-based counterparts. We illustrate this in a new toy environment, based on the dSprites dataset, and demonstrate that active inference agents automatically create disentangled representations that are apt for modeling state transitions. In a more complex Animal-AI environment, our agents (using the same neural architecture) are able to simulate future state transitions and actions (i.e., plan), to evince reward-directed navigation, despite temporary suspension of visual input. These results show that deep active inference -- equipped with MC methods -- provides a flexible framework to develop biologically-inspired intelligent agents, with applications in both machine learning and cognitive science.
Fernando Rosas, Pedro A. M. Mediano, Martin Ugarte, Henrik J. Jensen
Self-organisation lies at the core of fundamental but still unresolved scientific questions, and holds the promise of de-centralised paradigms crucial for future technological developments. While self-organising processes have been traditionally explained by the tendency of dynamical systems to evolve towards specific configurations, or attractors, we see self-organisation as a consequence of the interdependencies that those attractors induce. Building on this intuition, in this work we develop a theoretical framework for understanding and quantifying self-organisation based on coupled dynamical systems and multivariate information theory. We propose a metric of global structural strength that identifies when self-organisation appears, and a multi-layered decomposition that explains the emergent structure in terms of redundant and synergistic interdependencies. We illustrate our framework on elementary cellular automata, showing how it can detect and characterise the emergence of complex structures.
Adam B. Barrett, Pedro A. M. Mediano
Feb 12, 2019 · q-bio.NC
According to the Integrated Information Theory of Consciousness (IIT), consciousness is a fundamental observer-independent property of physical systems, and the measure Phi of integrated information is identical to the quantity or level of consciousness. For this to be plausible, there should be no alternative formulae for Phi consistent with the axioms of IIT, and there should not be cases of Phi being ill-defined. This article presents three ways in which Phi, in its current formulation, fails to meet these standards, and discusses how this problem might be addressed.
Aaron J. Gutknecht, Fernando E. Rosas, David A. Ehrlich, Abdullah Makkeh, Pedro A. M. Mediano, Michael Wibral
Distributed systems, such as biological and artificial neural networks, process information via complex interactions engaging multiple subsystems, resulting in high-order patterns with distinct properties across scales. Investigating how these systems process information remains challenging due to difficulties in defining appropriate multivariate metrics and ensuring their scalability to large systems. To address these challenges, we introduce a novel framework based on what we call "Shannon invariants" -- quantities that capture essential properties of high-order information processing in a way that depends only on the definition of entropy and can be efficiently calculated for large systems. Our theoretical results demonstrate how Shannon invariants can be used to resolve long-standing ambiguities regarding the interpretation of widely used multivariate information-theoretic measures. Moreover, our practical results reveal distinctive information-processing signatures of various deep learning architectures across layers, which lead to new insights into how these systems process information and how this evolves during training. Overall, our framework resolves fundamental limitations in analyzing high-order phenomena and offers broad opportunities for theoretical developments and empirical analyses.
Fernando E. Rosas, Aaron Gutknecht, Pedro A. M. Mediano, Michael Gastpar
High-order phenomena play crucial roles in many systems of interest, but their analysis is often highly nontrivial. There is a rich literature providing a number of alternative information-theoretic quantities capturing high-order phenomena, but their interpretation and relationship with each other is not well understood. The lack of principles unifying these quantities obscures the choice of tools for enabling specific types of analyses. Here we show how an entropic conjugation provides a theoretically grounded principle to investigate the space of possible high-order quantities, clarifying the nature of the existing metrics while revealing gaps in the literature. This leads to the identification of novel notions of symmetry and skew-symmetry as key properties for guaranteeing a balanced account of high-order interdependencies and enabling broadly applicable analyses across physical systems.
Tom Yates, Yuzhou Cheng, Ignacio Alzugaray, Danyal Akarca, Pedro A. M. Mediano, Andrew J. Davison
Belief Propagation (BP) is a powerful algorithm for distributed inference in probabilistic graphical models; however, it quickly becomes infeasible for practical compute and memory budgets. Many efficient, non-parametric forms of BP have been developed, but the most popular is Gaussian Belief Propagation (GBP), a variant that assumes all distributions are locally Gaussian. GBP is widely used due to its efficiency and empirically strong performance in applications like computer vision or sensor networks, even when modelling non-Gaussian problems. In this paper, we seek to provide a theoretical guarantee for when Gaussian approximations are valid in highly non-Gaussian, sparsely-connected factor graphs performing BP (common in spatial AI). We leverage the Central Limit Theorem (CLT) to prove mathematically that variables' beliefs under BP converge to a Gaussian distribution in complex, loopy factor graphs obeying our four key assumptions. We then confirm experimentally that variable beliefs become increasingly Gaussian after just a few BP iterations in a stereo depth estimation task.
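The CLT intuition behind the guarantee above can be demonstrated numerically. This sketch is not GBP itself (no factor graph is built); it merely shows the statistical mechanism the paper leans on: an aggregate of many independent, strongly non-Gaussian contributions is far closer to Gaussian than any single contribution, as measured by its skewness and excess kurtosis:

```python
import numpy as np

rng = np.random.default_rng(2)

# Each "message-like" contribution is strongly non-Gaussian (exponential,
# skewness 2); the aggregate sums 50 such independent contributions.
n_msgs, n_samples = 50, 100_000
terms = rng.exponential(1.0, size=(n_samples, n_msgs))
belief = terms.sum(axis=1)

z = (belief - belief.mean()) / belief.std()
skew = np.mean(z ** 3)                   # theory for this sum: 2/sqrt(50) ≈ 0.28
excess_kurtosis = np.mean(z ** 4) - 3    # theory for this sum: 6/50 = 0.12

print(skew, excess_kurtosis)  # both much closer to 0 than a single term's
```

In the paper's setting the aggregation happens through message products at variable nodes; the four assumptions are what license treating those contributions as sufficiently independent for a CLT-style argument.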
Alberto Liardi, Madalina I. Sas, George Blackburne, William J. Knottenbelt, Pedro A. M. Mediano, Henrik Jeldtoft Jensen
Understanding a complex system entails capturing the non-trivial collective phenomena that arise from interactions between its different parts. Information theory is a flexible and robust framework to study such behaviours, with several measures designed to quantify and characterise the interdependencies among the system's components. However, since these estimators rely on the statistical distributions of observed quantities, it is crucial to examine the relationships between information-theoretic measures and the system's underlying mechanistic structure. To this end, here we present an information-theoretic analytical investigation of an elementary system of interactive random walkers subject to Gaussian noise. Focusing on partial information decomposition, causal emergence, and integrated information, our results help us develop some intuitions on their relationship with the physical parameters of the system. For instance, we observe that uncoupled systems can exhibit emergent properties, in a way that we suggest may be better described as 'statistically autonomous'. Overall, we observe that in this simple scenario information measures align more reliably with the system's mechanistic properties when calculated at the level of microscopic components, rather than their coarse-grained counterparts, and over timescales comparable with the system's intrinsic dynamics. Moreover, we show that approaches that separate the contributions of the system's dynamics and steady-state distribution (e.g. via causal perturbations) may help strengthen the interpretation of information-theoretic analyses.
Leyla Roksan Caglar, Pedro A. M. Mediano, Baihan Lin
Humans and modern vision models can reach similar classification accuracy while making systematically different kinds of mistakes - differing not in how often they err, but in who gets mistaken for whom, and in which direction. We show that these directional confusions reveal distinct inductive biases that are invisible to accuracy alone. Using matched human and deep vision model responses on a natural-image categorization task under 12 perturbation types, we quantify asymmetry in confusion matrices and link it to generalization geometry through a Rate-Distortion (RD) framework, summarized by three geometric signatures: slope (beta), curvature (kappa), and efficiency (AUC). We find that humans exhibit broad but weak asymmetries, whereas deep vision models show sparser, stronger directional collapses. Robustness training reduces global asymmetry but fails to recover the human-like breadth-strength profile of graded similarity. Mechanistic simulations further show that different asymmetry organizations shift the RD frontier in opposite directions, even when matched for performance. Together, these results position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift.
Hardik Rajpal, Clem von Stengel, Pedro A. M. Mediano, Fernando E. Rosas, Eduardo Viegas, Pablo A. Marquet, Henrik J. Jensen
Oct 31, 2023 · q-bio.PE
At what level does selective pressure effectively act? When considering the reproductive dynamics of interacting and mutating agents, it has long been debated whether selection is better understood by focusing on the individual or if hierarchical selection emerges as a consequence of joint adaptation. Despite longstanding efforts in theoretical ecology there is still no consensus on this fundamental issue, most likely due to the difficulty in obtaining adequate data spanning a sufficient number of generations and the lack of adequate tools to quantify the effect of hierarchical selection. Here we capitalise on recent advances in information-theoretic data analysis to advance this state of affairs by investigating the emergence of high-order structures -- such as groups of species -- in the collective dynamics of the Tangled Nature model of evolutionary ecology. Our results show that evolutionary dynamics can lead to clusters of species that act as a selective group, that acquire information-theoretic agency. Overall, our findings provide quantitative evidence supporting the relevance of high-order structures in evolutionary ecology, which can emerge even from relatively simple processes of adaptation and selection.
Fernando E. Rosas, Pedro A. M. Mediano, Michael Gastpar
Systems of interest for theoretical or experimental work often exhibit high-order interactions, corresponding to statistical interdependencies in groups of variables that cannot be reduced to dependencies in subsets of them. While still under active development, the framework of partial information decomposition (PID) has emerged as the dominant approach to conceptualise and calculate high-order interdependencies. PID approaches can be grouped into two types: directed approaches that divide variables into sources and targets, and undirected approaches that treat all variables equally. Directed and undirected approaches are usually employed to investigate different scenarios, and hence little is known about how these two types of approaches may relate to each other, or if their corresponding quantities are linked in some way. In this paper we investigate the relationship between the redundancy-synergy index (RSI) and the O-information, which are practical metrics of directed and undirected high-order interdependencies, respectively. Our results reveal tight links between these two quantities, and provide interpretations of them in terms of likelihood ratios in a hypothesis testing setting, as well as in terms of projections in information geometry.
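The O-information mentioned above has a closed form in terms of Shannon entropies: $\Omega(X) = (n-2)\,H(X) + \sum_i \left[ H(X_i) - H(X_{-i}) \right]$, where $X_{-i}$ is the system without variable $i$. As a hedged sketch (a naive plug-in estimator on discrete samples, not the authors' implementation), it can be computed directly:

```python
import numpy as np
from itertools import product

def entropy(samples):
    """Plug-in Shannon entropy (bits) of joint samples, shape (n_samples, n_vars)."""
    _, counts = np.unique(samples, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def o_information(samples):
    """O-information: (n-2) H(X) + sum_i [H(X_i) - H(X_{-i})].
    Positive values indicate redundancy-dominated interdependencies,
    negative values synergy-dominated ones."""
    n = samples.shape[1]
    total = (n - 2) * entropy(samples)
    for i in range(n):
        rest = np.delete(samples, i, axis=1)
        total += entropy(samples[:, [i]]) - entropy(rest)
    return total

# XOR (X3 = X1 ^ X2) is the canonical synergistic system: O-information = -1 bit.
xor = np.array([[a, b, a ^ b] for a, b in product([0, 1], repeat=2)])
xor = np.repeat(xor, 100, axis=0)        # equiprobable samples
print(o_information(xor))  # ≈ -1.0
```

The negative sign on the XOR triplet matches the convention that synergy-dominated systems have negative O-information; a copy-dominated system (all variables equal) would give a positive value instead.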
Zhaolu Liu, Robert L. Peach, Pedro A. M. Mediano, Mauricio Barahona
Models that rely solely on pairwise relationships often fail to capture the complete statistical structure of the complex multivariate data found in diverse domains, such as socio-economic, ecological, or biomedical systems. Non-trivial dependencies between groups of more than two variables can play a significant role in the analysis and modelling of such systems, yet extracting such high-order interactions from data remains challenging. Here, we introduce a hierarchy of $d$-order ($d \geq 2$) interaction measures, increasingly inclusive of possible factorisations of the joint probability distribution, and define non-parametric, kernel-based tests to establish systematically the statistical significance of $d$-order interactions. We also establish mathematical links with lattice theory, which elucidate the derivation of the interaction measures and their composite permutation tests; clarify the connection of simplicial complexes with kernel matrix centring; and provide a means to enhance computational efficiency. We illustrate our results numerically with validations on synthetic data, and through an application to neuroimaging data.
Pedro A. M. Mediano, Anil K. Seth, Adam B. Barrett
Jun 25, 2018 · q-bio.NC
Integrated Information Theory (IIT) is a prominent theory of consciousness that has at its centre measures that quantify the extent to which a system generates more information than the sum of its parts. While several candidate measures of integrated information (`$Φ$') now exist, little is known about how they compare, especially in terms of their behaviour on non-trivial network models. In this article we provide clear and intuitive descriptions of six distinct candidate measures. We then explore the properties of each of these measures in simulation on networks consisting of eight interacting nodes, animated with Gaussian linear autoregressive dynamics. We find a striking diversity in the behaviour of these measures -- no two measures show consistent agreement across all analyses. Further, only a subset of the measures appear to genuinely reflect some form of dynamical complexity, in the sense of simultaneous segregation and integration between system components. Our results help guide the operationalisation of IIT and advance the development of measures of integrated information that may have more general applicability.
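The simulation setting described above — an eight-node network with Gaussian linear autoregressive dynamics — is straightforward to set up. This is an illustrative sketch under my own parameter choices (random connectivity rescaled for stability), not the paper's specific networks or $Φ$ computations:

```python
import numpy as np

rng = np.random.default_rng(3)
n_nodes, n_steps = 8, 10_000

# Random connectivity, rescaled so the VAR(1) process X_t = A X_{t-1} + eps
# is stable (all eigenvalues of A strictly inside the unit circle).
A = rng.standard_normal((n_nodes, n_nodes)) * 0.1
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))

X = np.zeros((n_steps, n_nodes))
for t in range(1, n_steps):
    X[t] = A @ X[t - 1] + rng.standard_normal(n_nodes)

# For a stable VAR(1), the stationary covariance solves the discrete
# Lyapunov equation Sigma = A Sigma A^T + I; here we estimate it empirically.
Sigma = np.cov(X[1000:].T)               # discard a burn-in period
print(Sigma.shape)  # (8, 8)
```

Candidate integrated-information measures for such Gaussian systems are typically functions of this stationary covariance and the lagged covariances, which is what makes the linear-Gaussian setting analytically convenient for comparing them.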