Alexandra M. Proca, Fernando E. Rosas, Andrea I. Luppi, Daniel Bor, Matthew Crosby, Pedro A. M. Mediano
Striking progress has recently been made in understanding human cognition by analyzing how its neuronal underpinnings are engaged in different modes of information processing. Specifically, neural information can be decomposed into synergistic, redundant, and unique features, with synergistic components being particularly aligned with complex cognition. However, two fundamental questions remain unanswered: (a) precisely how and why a cognitive system can become highly synergistic; and (b) how these informational states map onto artificial neural networks in various learning modes. To address these questions, here we employ an information-decomposition framework to investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks in both supervised and reinforcement learning settings. Our results show that synergy increases as neural networks learn multiple diverse tasks. Furthermore, performance in tasks requiring integration of multiple information sources critically relies on synergistic neurons. Finally, randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness. Overall, our results suggest that while redundant information is required for robustness to perturbations in the learning process, synergistic information is used to combine information from multiple modalities -- and more generally for flexible and efficient learning. These findings open the door to new ways of investigating how and why learning systems employ specific information-processing strategies, and support the principle that the capacity for general-purpose learning critically relies on the system's information dynamics.
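The synergy/redundancy distinction central to this work can be illustrated with a toy example. The sketch below uses the minimal-mutual-information (MMI) redundancy function (an assumption for illustration; the paper's decomposition measure may differ): for a XOR gate, each input alone carries zero information about the output, but jointly they fully determine it, so the information is purely synergistic.

```python
# Toy illustration of synergy vs redundancy, using the MMI redundancy
# function (illustrative assumption, not necessarily the paper's measure).
from collections import Counter
from math import log2

def mutual_info(pairs):
    """I(A;B) in bits from a list of (a, b) samples (empirical plug-in)."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum(c / n * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
               for (a, b), c in p_ab.items())

# All four equally likely input patterns of a XOR gate.
samples = [(x1, x2, x1 ^ x2) for x1 in (0, 1) for x2 in (0, 1)]
i_x1 = mutual_info([(x1, y) for x1, _, y in samples])            # 0 bits
i_x2 = mutual_info([(x2, y) for _, x2, y in samples])            # 0 bits
i_joint = mutual_info([((x1, x2), y) for x1, x2, y in samples])  # 1 bit

redundancy = min(i_x1, i_x2)         # MMI redundancy: 0 bits
synergy = i_joint - max(i_x1, i_x2)  # 1 bit: purely synergistic
print(redundancy, synergy)  # → 0.0 1.0
```

Duplicating one input instead (Y = X1 with X2 a copy of X1) would give the opposite profile: 1 bit of redundancy and no synergy.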
Fernando E. Rosas, Diego Candia-Rivera, Andrea I. Luppi, Yike Guo, Pedro A. M. Mediano
Recent research is revealing how cognitive processes are supported by a complex interplay between the brain and the rest of the body, which can be investigated by the analysis of physiological features such as breathing rhythms, heart rate, and skin conductance. Heart rate dynamics are of particular interest as they provide a way to track the sympathetic and parasympathetic outflow from the autonomic nervous system, which is known to play a key role in modulating attention, memory, decision-making, and emotional processing. However, extracting useful information about the autonomic outflow from heartbeats is still challenging due to the noisy estimates that result from standard signal-processing methods. To advance this state of affairs, we propose a paradigm shift in how we conceptualise and model heart rate: instead of being a mere summary of the observed inter-beat intervals, we introduce a modelling framework that views heart rate as a hidden stochastic process that drives the observed heartbeats. Moreover, by leveraging the rich literature of state-space modelling and Bayesian inference, our proposed framework delivers a description of heart rate dynamics that is not a point estimate but a posterior distribution of a generative model. We illustrate the capabilities of our method by showing that it recapitulates linear properties of conventional heart rate estimators, while exhibiting better discriminative power when metrics of dynamical complexity are compared across different physiological states.
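The "heart rate as a hidden process" idea can be sketched with the simplest possible state-space model: the mean inter-beat interval (IBI) follows a Gaussian random walk, observed IBIs are noisy readings of it, and a Kalman filter yields a posterior (mean and variance) over heart rate at each beat rather than a point estimate. All parameters and the linear-Gaussian form are illustrative assumptions, not the paper's actual model.

```python
# Minimal linear-Gaussian sketch of heart rate as a hidden stochastic
# process, filtered with a Kalman filter (illustrative assumptions only).
import random

random.seed(0)
q, r = 1e-4, 4e-3  # process / observation noise variances (s^2), assumed

# Simulate: hidden mean IBI drifts; observed IBIs add measurement noise.
hidden, obs = [0.8], []
for _ in range(200):
    hidden.append(hidden[-1] + random.gauss(0, q ** 0.5))
    obs.append(hidden[-1] + random.gauss(0, r ** 0.5))

# Kalman filter: posterior mean m and variance p over the hidden IBI.
m, p, posterior = 0.8, 1.0, []
for y in obs:
    p += q                           # predict: uncertainty grows per beat
    k = p / (p + r)                  # Kalman gain
    m += k * (y - m)                 # update with the newly observed IBI
    p *= (1 - k)
    posterior.append((60.0 / m, p))  # heart rate (bpm) with uncertainty

print(f"final rate estimate: {posterior[-1][0]:.1f} bpm "
      f"(posterior IBI variance {posterior[-1][1]:.2e})")
```

The posterior variance settles well below the raw observation noise, which is the sense in which the filtered description is sharper than beat-by-beat point estimates.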
Pedro Urbina-Rodriguez, Zafeirios Fountas, Fernando E. Rosas, Jun Wang, Andrea I. Luppi, Haitham Bou-Ammar, Murray Shanahan, Pedro A. M. Mediano
The independent evolution of intelligence in biological and artificial systems offers a unique opportunity to identify its fundamental computational principles. Here we show that large language models spontaneously develop synergistic cores -- components where information integration exceeds individual parts -- remarkably similar to those in the human brain. Using principles of information decomposition across multiple LLM families and architectures, we find that areas in middle layers exhibit synergistic processing while early and late layers rely on redundancy, mirroring the informational organisation in biological brains. This organisation emerges through learning and is absent in randomly initialised networks. Crucially, ablating synergistic components causes disproportionate behavioural changes and performance loss, aligning with theoretical predictions about the fragility of synergy. Moreover, fine-tuning synergistic regions through reinforcement learning yields significantly greater performance gains than training redundant components, yet supervised fine-tuning shows no such advantage. This convergence suggests that synergistic information processing is a fundamental property of intelligence, providing targets for principled model design and testable predictions for biological intelligence.
Pedro A. M. Mediano, Fernando E. Rosas, Andrea I. Luppi, Henrik J. Jensen, Anil K. Seth, Adam B. Barrett, Robin L. Carhart-Harris, Daniel Bor
Nov 12, 2021 · q-bio.NC
Emergence is a profound subject that straddles many scientific disciplines, including the formation of galaxies and how consciousness arises from the collective activity of neurons. Despite the broad interest that exists on this concept, the study of emergence has suffered from a lack of formalisms that could be used to guide discussions and advance theories. Here we summarise, elaborate on, and extend a recent formal theory of causal emergence based on information decomposition, which is quantifiable and amenable to empirical testing. This theory relates emergence with information about a system's temporal evolution that cannot be obtained from the parts of the system separately. This article provides an accessible but rigorous introduction to the framework, discussing the merits of the approach in various scenarios of interest. We also discuss several interpretation issues and potential misunderstandings, while highlighting the distinctive benefits of this formalism.
Hanna M. Tolle, Andrea I. Luppi, Anil K. Seth, Pedro A. M. Mediano
Jun 27, 2024 · q-bio.NC
Biological neural networks can perform complex computations to predict their environment, far above the limited predictive capabilities of individual neurons. While conventional approaches to understanding these computations often focus on isolating the contributions of single neurons, here we argue that a deeper understanding requires considering emergent dynamics - dynamics that make the whole system "more than the sum of its parts". Specifically, we examine the relationship between prediction performance and emergence by leveraging recent quantitative metrics of emergence, derived from Partial Information Decomposition, and by modelling the prediction of environmental dynamics in a bio-inspired computational framework known as reservoir computing. Notably, we reveal a bidirectional coupling between prediction performance and emergence, which generalises across task environments and reservoir network topologies, and is recapitulated by three key results: 1) Optimising hyperparameters for performance enhances emergent dynamics, and vice versa; 2) Emergent dynamics represent a near sufficient criterion for prediction success in all task environments, and an almost necessary criterion in most environments; 3) Training reservoir computers on larger datasets results in stronger emergent dynamics, which contain task-relevant information crucial for performance. Overall, our study points to a pivotal role of emergence in facilitating environmental predictions in a bio-inspired computational architecture.
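The reservoir-computing framework used in this study can be sketched minimally as an echo state network: a fixed random recurrent network is driven by the input, and only a linear readout is trained to predict the next time step. The network size, spectral radius, and task (a sine wave) below are illustrative assumptions, not the paper's setup.

```python
# Minimal echo state network trained for one-step-ahead prediction of a
# sine wave; only the linear readout is fitted (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
n_res, steps = 100, 1000
u = np.sin(0.2 * np.arange(steps + 1))  # driving signal

# Fixed random reservoir, rescaled to spectral radius 0.9 (< 1, so the
# reservoir state depends only on recent input history).
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
w_in = rng.normal(size=n_res)

# Collect reservoir states while the input drives the network.
x = np.zeros(n_res)
states = []
for t in range(steps):
    x = np.tanh(W @ x + w_in * u[t])
    states.append(x.copy())
X = np.array(states[100:])    # discard initial transient
y = u[101:steps + 1]          # one-step-ahead targets

# Ridge-regression readout: the only trained component.
w_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
pred = X @ w_out
print("test RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

Because training touches only the readout, any task-relevant structure in the reservoir states themselves (the object of the emergence analysis) arises from the driven dynamics, not from weight updates.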
Andrea I. Luppi, Eckehard Olbrich, Conor Finn, Laura E. Suárez, Fernando E. Rosas, Pedro A. M. Mediano, Jürgen Jost
Understanding how different networks relate to each other is key to gaining greater insight into complex systems. Here, we introduce an intuitive yet powerful framework to characterise the relationship between two networks comprising the same nodes. We showcase our framework by decomposing the shortest paths between nodes as being contributed uniquely by one or the other source network, or redundantly by either, or synergistically by the two together. Our approach takes into account the networks' full topology, and it also provides insights at multiple levels of resolution: from global statistics, to individual paths of different length. We show that this approach is widely applicable, from brains to the London public transport system. In humans and across 123 other mammalian species, we demonstrate that reliance on unique contributions by long-range white matter fibers is a conserved feature of mammalian structural brain networks. Across species, we also find that efficient communication relies on significantly greater synergy between long-range and short-range fibers than expected by chance, and significantly less redundancy. Our framework may find applications to help decide how to trade off different desiderata when designing network systems, or to evaluate their relative presence in existing systems, whether biological or artificial.
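One way to make the unique/redundant/synergistic path decomposition concrete is the following sketch. It uses an assumed operationalisation (not necessarily the paper's exact definitions): a node pair's shortest path in the union of two networks is labelled by whether its length is matched by network A alone, B alone, both (redundant), or neither (synergistic, i.e. the shortcut needs edges from both networks).

```python
# Illustrative sketch of classifying shortest paths between two networks
# on the same nodes (assumed operationalisation, for intuition only).
from collections import deque

def bfs_dist(adj, src):
    """Unweighted shortest-path lengths from src via breadth-first search."""
    dist = {src: 0}
    q = deque([src])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    return dist

def classify(edges_a, edges_b, nodes):
    def to_adj(edges):
        adj = {v: set() for v in nodes}
        for u, v in edges:
            adj[u].add(v); adj[v].add(u)
        return adj
    a, b = to_adj(edges_a), to_adj(edges_b)
    union = to_adj(list(edges_a) + list(edges_b))
    labels = {}
    for s in nodes:
        da, db, du = bfs_dist(a, s), bfs_dist(b, s), bfs_dist(union, s)
        for t in nodes:
            if t <= s or t not in du:
                continue
            in_a = da.get(t) == du[t]   # A alone achieves the union length
            in_b = db.get(t) == du[t]   # B alone achieves it
            labels[(s, t)] = ("redundant" if in_a and in_b else
                              "unique_a" if in_a else
                              "unique_b" if in_b else "synergistic")
    return labels

# Tiny example: edge 1-2 exists only in A, edge 2-3 only in B, so the
# shortest 1-3 path needs both networks: it is synergistic.
labels = classify([(1, 2)], [(2, 3)], [1, 2, 3])
print(labels)
```

Aggregating these labels over all node pairs would give the kind of global unique/redundant/synergy statistics the abstract describes.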
Pedro A. M. Mediano, Fernando E. Rosas, Andrea I. Luppi, Robin L. Carhart-Harris, Daniel Bor, Anil K. Seth, Adam B. Barrett
Sep 27, 2021 · q-bio.NC
Complex systems, from the human brain to the global economy, are made of multiple elements that interact in such ways that the behaviour of the `whole' often seems to be more than what is readily explainable in terms of the `sum of the parts.' Our ability to understand and control these systems remains limited, one reason being that we still don't know how best to describe -- and quantify -- the higher-order dynamical interactions that characterise their complexity. To address this limitation, we combine principles from the theories of Information Decomposition and Integrated Information into what we call Integrated Information Decomposition, or $\Phi$ID. $\Phi$ID provides a comprehensive framework to reason about, evaluate, and understand the information dynamics of complex multivariate systems. $\Phi$ID reveals the existence of previously unreported modes of collective information flow, providing tools to express well-known measures of information transfer and dynamical complexity as aggregates of these modes. Via computational and empirical examples, we demonstrate that $\Phi$ID extends our explanatory power beyond traditional causal discovery methods -- with profound implications for the study of complex systems across disciplines.
Fernando E. Rosas, Bernhard C. Geiger, Andrea I. Luppi, Anil K. Seth, Daniel Polani, Michael Gastpar, Pedro A. M. Mediano
Understanding the functional architecture of complex systems is crucial to illuminate their inner workings and enable effective methods for their prediction and control. Recent advances have introduced tools to characterise emergent macroscopic levels; however, while these approaches are successful in identifying when emergence takes place, they are limited in the extent to which they can determine how it does. Here we address this limitation by developing a computational approach to emergence, which characterises macroscopic processes in terms of their computational capabilities. Concretely, we articulate a view on emergence based on how software works, rooted in a mathematical formalism that describes how macroscopic processes can express self-contained informational, interventional, and computational properties. This framework establishes a hierarchy of nested self-contained processes that determines what computations take place at what level, which in turn delineates the functional architecture of a complex system. This approach is illustrated on paradigmatic models from the statistical physics and computational neuroscience literature, which are shown to exhibit macroscopic processes that are akin to software in human-engineered systems. Overall, this framework enables a deeper understanding of the multi-level structure of complex systems, revealing specific ways in which they can be efficiently simulated, predicted, and controlled.
Fernando E. Rosas, Pedro A. M. Mediano, Andrea I. Luppi, Thomas F. Varley, Joseph T. Lizier, Sebastiano Stramaglia, Henrik J. Jensen, Daniele Marinazzo
Battiston et al. (arXiv:2110.06023) provide a comprehensive overview of how investigations of complex systems should take into account interactions between more than two elements, which can be modelled by hypergraphs and studied via topological data analysis. Following a separate line of enquiry, a broad literature has developed information-theoretic tools to characterize high-order interdependencies from observed data. While these could seem to be competing approaches aiming to address the same question, in this correspondence we clarify that this is not the case, and that a complete account of higher-order phenomena needs to embrace both.
Andrea I. Luppi, Fernando E. Rosas, Gustavo Deco, Morten L. Kringelbach, Pedro A. M. Mediano
Aug 10, 2023 · q-bio.NC
Temporal irreversibility, often referred to as the arrow of time, is a fundamental concept in statistical mechanics. Markers of irreversibility also provide a powerful characterisation of information processing in biological systems. However, current approaches tend to describe temporal irreversibility in terms of a single scalar quantity, without disentangling the underlying dynamics that contribute to irreversibility. Here we propose a broadly applicable information-theoretic framework to characterise the arrow of time in multivariate time series, which yields qualitatively different types of irreversible information dynamics. This multidimensional characterisation reveals previously unreported high-order modes of irreversibility, and establishes a formal connection between recent heuristic markers of temporal irreversibility and metrics of information processing. We demonstrate the prevalence of high-order irreversibility in the hyperactive regime of a biophysical model of brain dynamics, showing that our framework is both theoretically principled and empirically useful. This work challenges the view of the arrow of time as a monolithic entity, enhancing both our theoretical understanding of irreversibility and our ability to detect it in practical applications.
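The scalar markers of irreversibility that this framework generalises can be illustrated with a simple generic estimator (not the paper's multivariate measure): the KL divergence between the empirical distributions of transitions in a discrete time series read forwards versus backwards. Reversible dynamics give near-zero values, while strictly one-way transitions give an infinite value.

```python
# A simple scalar marker of temporal irreversibility: KL divergence
# between forward and time-reversed empirical transition frequencies
# (generic illustration, not the paper's multivariate framework).
from collections import Counter
from math import log2

def irreversibility(seq):
    fwd = Counter(zip(seq, seq[1:]))
    n = sum(fwd.values())
    kl = 0.0
    for (a, b), c in fwd.items():
        p = c / n
        q = fwd.get((b, a), 0) / n  # same transition, read backwards
        if q == 0:
            return float("inf")     # a strictly one-way transition
        kl += p * log2(p / q)
    return kl

print(irreversibility([0, 1] * 50))     # alternating: near zero
print(irreversibility([0, 1, 2] * 20))  # directed cycle: inf
```

A directed cycle 0→1→2→0 never produces the reversed transitions 1→0, 2→1, or 0→2, so it is maximally irreversible by this measure; an alternating sequence looks (almost) the same in both time directions.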