Leyla Roksan Caglar, Pedro A. M. Mediano, Baihan Lin
Humans and modern vision models can reach similar classification accuracy while making systematically different kinds of mistakes, differing not in how often they err, but in who gets mistaken for whom, and in which direction. We show that these directional confusions reveal distinct inductive biases that are invisible to accuracy alone. Using matched human and deep vision model responses on a natural-image categorization task under 12 perturbation types, we quantify asymmetry in confusion matrices and link it to generalization geometry through a Rate-Distortion (RD) framework, summarized by three geometric signatures: slope (beta), curvature (kappa), and efficiency (AUC). We find that humans exhibit broad but weak asymmetries, whereas deep vision models show sparser, stronger directional collapses. Robustness training reduces global asymmetry but fails to recover the human-like breadth-strength profile of graded similarity. Mechanistic simulations further show that different asymmetry organizations shift the RD frontier in opposite directions, even when matched for performance. Together, these results position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift.
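The directional (as opposed to graded-similarity) structure of a confusion matrix can be made concrete with a few lines of NumPy. The sketch below splits the off-diagonal confusions into symmetric and antisymmetric parts and reports the fraction of confusion mass carried by the antisymmetric, directional component; the index, normalization, and example matrices are illustrative assumptions, not the paper's exact asymmetry measure.

```python
import numpy as np

def asymmetry_index(C):
    """Fraction of off-diagonal confusion mass in the antisymmetric
    (directional) part. C: square confusion matrix, rows = true class,
    columns = predicted class. Returns 0 for a perfectly symmetric
    confusion pattern; a purely one-directional collapse gives 0.5,
    the maximum under this normalization."""
    C = np.asarray(C, dtype=float)
    off = C - np.diag(np.diag(C))      # drop correct classifications
    anti = 0.5 * (off - off.T)         # directional component
    sym = 0.5 * (off + off.T)          # graded-similarity component
    denom = np.linalg.norm(anti) + np.linalg.norm(sym)
    return 0.0 if denom == 0 else np.linalg.norm(anti) / denom

# Broad but symmetric confusions (human-like profile):
C_sym = np.array([[90, 5, 5], [5, 90, 5], [5, 5, 90]])
# A one-directional collapse (class 0 absorbed into class 1):
C_dir = np.array([[80, 20, 0], [0, 100, 0], [0, 0, 100]])
```

With these toy matrices, `asymmetry_index(C_sym)` is 0 and `asymmetry_index(C_dir)` reaches the 0.5 ceiling, mirroring the broad-weak versus sparse-strong contrast described in the abstract.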
Eghbal A. Hosseini, Brian Cheung, Evelina Fedorenko, Alex H. Williams
Apr 23, 2026·q-bio.NC·PDF Neural networks exhibit a remarkable degree of representational convergence across diverse architectures, training objectives, and even data modalities. This convergence is predictive of alignment with brain representations. A recent hypothesis suggests this arises from learning the underlying structure in the environment in similar ways. However, it is unclear how individual stimuli elicit convergent representations across networks. An image can be perceived in multiple ways and expressed differently using words. Here, we introduce a methodology based on the Generalized Procrustes Algorithm to measure intra-modal representational convergence at the single-stimulus level. We applied this to vision models with distinct training objectives, selecting stimuli based on their degree of alignment (intra-modal dispersion). Crucially, we found that this intra-modal dispersion strongly modulates alignment between vision and language models (cross-modal convergence). Specifically, stimuli with low intra-modal dispersion (high agreement among vision models) elicited significantly higher cross-modal alignment than those with high dispersion, by up to a factor of two (e.g., in pairings of DINOv2 with language models). This effect was robust to stimulus selection criteria and generalized across different pairings of vision and language models. Measuring convergence at the single-stimulus level provides a path toward understanding the sources of convergence and divergence across modalities, and between neural networks and human neural representations.
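The per-stimulus dispersion idea can be sketched with orthogonal Procrustes alignment. A full Generalized Procrustes Analysis iterates alignment to a converged consensus configuration; the simplified one-pass version below aligns every model's embeddings to the first model and measures, stimulus by stimulus, how far the aligned representations scatter around their mean. Function names and shapes are illustrative, not the paper's pipeline.

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal matrix Q minimizing ||X @ Q - Y||_F,
    for X, Y of shape (n_stimuli, d)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def stimulus_dispersion(reps):
    """Per-stimulus disagreement among models after alignment.

    reps: list of (n_stimuli, d) arrays, one per vision model.
    Returns, for each stimulus, the mean squared distance of each
    model's aligned point from the cross-model mean."""
    ref = reps[0]
    aligned = [X @ procrustes_align(X, ref) for X in reps]
    stack = np.stack(aligned)            # (n_models, n_stimuli, d)
    mean = stack.mean(axis=0)
    return ((stack - mean) ** 2).sum(axis=2).mean(axis=0)
```

If two models' embeddings differ only by a rotation, every stimulus gets dispersion zero; stimuli on which models genuinely disagree stand out with large values, and these are the ones the abstract predicts will show weaker cross-modal alignment.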
Gustavo G. Cambrainha, Daniel M. Castro, Leonardo L. Gollo, Pedro V. Carelli, Mauro Copelli
Apr 23, 2026·q-bio.NC·PDF The hierarchical organization of the brain is a fundamental structural principle, while brain criticality is a leading hypothesis for its collective dynamics. However, the connection between structure and signatures of criticality remains an open question. Here, we address this issue by applying phenomenological renormalization group approaches to large-scale neuronal spiking activity from the mouse visual cortex and hippocampus. We find that signatures of criticality are not uniform, but instead vary systematically along the known anatomical hierarchy in both brain systems. Strikingly, the direction of this gradient is inconsistent across different criticality exponents, revealing a nontrivial, measure-dependent organization: exponents based on static properties point to a gradient in one direction, while the exponent based on dynamic properties points in the opposite direction. Moreover, the signatures across the visual system are strongly modulated by engagement in a visual task. We show that the correlations among criticality markers of different brain regions during active engagement are sufficient to reconstruct the anatomical hierarchy from the dynamics. The scaling exponents closely follow a theoretically predicted scaling relation and covary with hierarchical position. Our findings provide a direct link between the collective dynamics of neurons and the macroscopic architecture of the brain.
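The core operation of the phenomenological renormalization group for neural data, in the spirit of Meshulam et al., is a coarse-graining step that merges maximally correlated neurons. The sketch below uses a simple greedy pairing (an illustrative simplification of the published procedure) and drops an odd leftover neuron.

```python
import numpy as np

def coarse_grain(X):
    """One coarse-graining step: greedily pair each neuron with its
    most-correlated unpaired partner and sum their activity.
    X: array of shape (n_neurons, n_time_bins). Returns an array with
    half as many rows; an odd leftover neuron is dropped."""
    C = np.corrcoef(X)
    free = set(range(X.shape[0]))
    clusters = []
    while len(free) >= 2:
        # most correlated remaining pair (O(n^2) scan; fine for a sketch)
        i, j = max(((a, b) for a in free for b in free if a != b),
                   key=lambda p: C[p])
        clusters.append(X[i] + X[j])
        free -= {i, j}
    return np.array(clusters)
```

Iterating this step and tracking how quantities such as activity variance or the eigenvalue spectrum scale with cluster size yields the criticality exponents whose hierarchical gradients the abstract describes.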
Larissa Höfling, Matthias Tangemann, Lotta Piefke, Susanne Keller, Katrin Franke, Matthias Bethge
Apr 23, 2026·q-bio.NC·PDF Neuroscientists and computer vision researchers use model-brain alignment benchmarks to compare artificial and biological vision systems. These benchmarks rank models according to alignment measures such as the similarity of representational geometry or the predictability of neural responses from model activations. However, recent works have identified a number of problems with these rankings, among them their lack of discriminative power and robustness, raising the conceptual question of what it means for a model to be brain-aligned. Here we introduce alignment patterns -- characteristic functional relationship profiles of each brain region to all others -- and propose that models should reproduce these patterns to qualify as brain-aligned. First, we apply a standard benchmarking pipeline to a broad spectrum of vision models on the BOLD Moments video fMRI dataset across visual regions of interest (ROIs). We find that diverse models appear equivalent in their brain alignment, reflecting the lack of discriminative power of conventional alignment benchmarking pipelines. In contrast, alignment pattern analysis (APA) provides a second-order structural consistency test: a model aligned to a given ROI should reproduce that ROI's characteristic cross-region alignment profile. Applying APA, we find that, while these patterns are highly stable across brains of different subjects, even top-ranked models often fail to capture them. Finally, we argue for a clearer distinction between the criteria a model must meet to serve as a tool versus as a computational model for human visual cortex. Conventional alignment measures may be sufficient for identifying neurally predictive models, but claims about computational or algorithmic similarity may require a stronger basis of evidence, including the reproducibility of relational alignment patterns.
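A minimal version of a cross-region alignment profile can be written with representational dissimilarity matrices (RDMs): one ROI's profile is the vector of similarities between its RDM and every other ROI's RDM. The choice of 1 - correlation as the dissimilarity and Pearson as the second-order comparison are assumptions for illustration; the paper's APA pipeline may use different measures.

```python
import numpy as np

def rdm(X):
    """Representational dissimilarity matrix (1 - Pearson correlation)
    for responses X of shape (n_stimuli, n_features)."""
    return 1.0 - np.corrcoef(X)

def upper(M):
    """Vectorize the upper triangle (the unique dissimilarities)."""
    return M[np.triu_indices_from(M, k=1)]

def alignment_profile(target, others):
    """Characteristic profile of one ROI: similarity of its RDM to the
    RDM of every other ROI (Pearson over upper triangles)."""
    t = upper(rdm(target))
    return np.array([np.corrcoef(t, upper(rdm(o)))[0, 1] for o in others])
```

A model layer aligned to a given ROI would then be asked to reproduce that ROI's profile, e.g. by comparing `alignment_profile(model_layer, roi_list)` against `alignment_profile(roi, roi_list)` rather than comparing model and ROI directly.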
Naga VS Raviteja Chappa, Evangelos Sariyanidi, Lisa Yankowitz, Gokul Nair, Casey J. Zampella, Robert T. Schultz, Birkan Tunç
Micro-actions are subtle, localized movements lasting 1-3 seconds, such as scratching one's head or tapping fingers. Such subtle actions are essential for social communication, ubiquitously used in natural interactions, and thus critical for fine-grained video understanding, yet remain poorly understood by current computer vision systems. We identify a fundamental challenge: micro-actions exhibit diverse spatio-temporal characteristics, where some are defined by spatial configurations while others manifest through temporal dynamics. Existing methods that commit to a single spatio-temporal decomposition cannot accommodate this diversity. We propose a dual-path network that processes anatomically-grounded spatial entities through parallel Spatial-Temporal (ST) and Temporal-Spatial (TS) pathways. The ST path captures spatial configurations before modeling temporal dynamics, while the TS path inverts this order to prioritize temporal dynamics. Rather than fixed fusion, we introduce entity-level adaptive routing, where each body part learns its optimal processing preference, complemented by a Mutual Action Consistency (MAC) loss that enforces cross-path coherence. Extensive experiments demonstrate competitive performance on the MA-52 dataset and state-of-the-art results on the iMiGUE dataset. Our work reveals that architectural adaptation to the inherent complexity of micro-actions is essential for advancing fine-grained video understanding.
Guanghui Cai, Zhen-Ye Huang, Weikang Wang, Hai-Jun Zhou
Apr 22, 2026·q-bio.NC·PDF Lateral predictive coding (LPC) is a simple theoretical framework for understanding feature detection in biological neural circuits. Recent theoretical work [Huang et al., Phys. Rev. E 112, 034304 (2025)] has successfully constructed optimal LPC networks capable of extracting non-Gaussian hidden input features by imposing a tradeoff between energetic cost and information robustness, but the resulting dynamical systems of recurrent interactions can be very slow in responding to external inputs. In the present paper, we investigate how to reduce this response time. We find that the characteristic response time of the LPC system can be minimized to closely approach the lower-bound value without compromising the mean predictive error (energetic cost) or the information robustness of signal transmission. We further demonstrate that optimal LPC networks with a modular structural organization and a greatly reduced number of lateral interactions perform as well as all-to-all fully connected networks in terms of feature detection performance, response time, energetic cost, and information robustness.
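Why recurrent prediction can be slow is easy to see in a generic linear stand-in (the paper's LPC model is more elaborate): for dynamics of the form tau * dx/dt = -x + W @ x + s, the characteristic response time is set by the eigenvalue of W closest to 1, and diverges as the recurrence approaches criticality. The function below computes that time; the lower bound tau is reached only with no effective recurrence.

```python
import numpy as np

def response_time(W, tau=1.0):
    """Slowest relaxation time of the linear recurrent dynamics
    tau * dx/dt = -x + W @ x + s  (an illustrative stand-in for an
    LPC network's response; the paper's model differs in detail).
    Valid only when every eigenvalue of W has real part below 1."""
    lam = np.linalg.eigvals(W).real.max()
    if lam >= 1.0:
        raise ValueError("unstable recurrent matrix")
    return tau / (1.0 - lam)

t_fast = response_time(np.zeros((3, 3)))    # no recurrence: lower bound tau
t_slow = response_time(0.99 * np.eye(3))    # near-critical: ~100x slower
```

Minimizing response time while keeping predictive error and robustness fixed then amounts to shaping the spectrum of the lateral interaction matrix, which is why sparse modular connectivity can match all-to-all networks.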
Gastón Avetta, Jose Lobera, Juan José Zárate, Inés Samengo, Damián G. Hernández
Apr 21, 2026·q-bio.NC·PDF Perceptual judgments of sequential stimuli are systematically biased by prior expectations and by the temporal structure of sensory input. In haptic discrimination tasks, these effects often manifest as time-order asymmetries, whereby the perceived difference between two stimuli depends on their presentation order. Here, we introduce a dynamical Bayesian model that accounts for these biases by combining noisy sensory measurements with an evolving internal representation of stimulus intensity. The model formalizes perception as an inference process in which prior expectations are updated by incoming stimuli and propagate in time between observations. We test the model on psychophysical data from vibrotactile discrimination experiments, in which participants compare pairs of sequential stimuli with varying intensities. With a small number of parameters, the model quantitatively reproduces both the direction and magnitude of time-order effects across subjects, as well as the observed inter-individual variability. The inferred parameters provide a compact description of perceptual biases in terms of prior expectations and noise characteristics. Beyond fitting the data, the model induces a transformation of stimulus space, leading to a subject-dependent geometry of perceived stimuli. In this transformed space, perceptual judgments exhibit approximate symmetries that are absent in the physical stimulus coordinates. These results suggest that temporal biases in perception can be understood as a consequence of dynamical inference, and that they impose non-trivial geometric constraints on perceptual representations.
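The mechanism behind such time-order asymmetries can be sketched as a two-step Bayesian estimate with contraction toward the prior: each stimulus is estimated by precision-weighting the measurement against a prior, and the first estimate drifts back toward the prior before the comparison. This is a deliberately minimal sketch; the paper's dynamical model propagates a full evolving prior between observations, and all parameter names below are illustrative.

```python
def perceived(s, prior_mean, prior_var, noise_var):
    """Bayesian point estimate of one stimulus: precision-weighted
    average of the measurement s and the prior mean."""
    w = prior_var / (prior_var + noise_var)
    return w * s + (1 - w) * prior_mean

def perceived_difference(s1, s2, prior_mean, prior_var, noise_var, drift=0.5):
    """Estimate of (second - first) for two sequential stimuli.
    The first estimate partially decays toward the prior during the
    inter-stimulus interval, producing an order-dependent bias."""
    e1 = perceived(s1, prior_mean, prior_var, noise_var)
    e1 = drift * prior_mean + (1 - drift) * e1   # decay toward prior
    e2 = perceived(s2, prior_mean, prior_var, noise_var)
    return e2 - e1
```

With a prior centered at 0 and equal prior and noise variances, presenting the weaker stimulus first yields a large perceived difference while the reverse order yields almost none, i.e. the perceived difference depends on presentation order exactly as in the time-order effects the abstract describes.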
Konstantin F. Willeke, Polina Turishcheva, Alex Gilbert, Goirik Chakrabarty, Hasan A. Bedel, Paul G. Fahey, Yongrong Qiu, Marissa A. Weis, Michaela Vystrčilová, Taliah Muhammad, Lydia Ntanavara, Rachel E. Froebe, Kayla Ponder, Zheng Huan Tan, Emin Orhan, Erick Cobos, Sophia Sanborn, Katrin Franke, Fabian H. Sinz, Alexander S. Ecker, Andreas S. Tolias
Apr 20, 2026·q-bio.NC·PDF Scaling data and artificial neural networks has transformed AI, driving breakthroughs in language and vision. Whether similar principles apply to modeling brain activity remains unclear. Here we leveraged a dataset of 3.1 million neurons from the visual cortex of 73 mice across 323 sessions, totaling more than 150 billion neural tokens recorded during natural movies, images and parametric stimuli, and behavior. We train multi-modal, multi-task models that support three regimes flexibly at test time: neural prediction, behavioral decoding, neural forecasting, or any combination of the three. OmniMouse achieves state-of-the-art performance, outperforming specialized baselines across nearly all evaluation regimes. We find that performance scales reliably with more data, but gains from increasing model size saturate. This inverts the standard AI scaling story: in language and computer vision, massive datasets make parameter scaling the primary driver of progress, whereas in brain modeling -- even in the mouse visual cortex, a relatively simple system -- models remain data-limited despite vast recordings. The observation of systematic scaling raises the possibility of phase transitions in neural modeling, where larger and richer datasets might unlock qualitatively new capabilities, paralleling the emergent properties seen in large language models. Code available at https://github.com/enigma-brain/omnimouse.
Beatrice Caon, Mattia Corti, Francesca Bonizzoni, Paola F. Antonietti
Alzheimer's disease is the most common neurodegenerative disorder. Its pathological development is connected with the misfolding and accumulation of two toxic proteins: amyloid-beta and tau. Mathematical models provide a valuable quantitative tool for monitoring disease progression. In this work, we propose and compare two modeling approaches in which the spatio-temporal dynamics of amyloid-beta and tau proteins are described either on three-dimensional patient-specific geometries or through reduced network-based models defined on the brain connectome. More specifically, a high-fidelity biophysical model is proposed on three-dimensional brain geometries reconstructed from magnetic resonance imaging, whereas a network-based reduced formulation is defined on the brain connectome. For both approaches, a suitable numerical discretisation is proposed. A sensitivity analysis is presented to quantify the influence of model parameters on protein concentration patterns and to compare the quality of the predictions. For both approaches, the results are validated against PET-SUVR clinical data using [18F]AZD4694 for amyloid-beta and [18F]MK6240 for tau. The results indicate that the three-dimensional model provides the most accurate and biologically consistent description of disease progression, but remains computationally demanding. The reduced graph-based model, on the other hand, is cheaper, but is not always able to achieve reliable results.
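Network-based reduced models of prion-like protein spread are often Fisher-KPP dynamics on the connectome's graph Laplacian; whether this matches the paper's exact reduced formulation is an assumption, but it illustrates the class. The sketch below seeds misfolded protein in one region of a toy five-node "connectome" and integrates du/dt = -rho * L u + alpha * u (1 - u) with forward Euler.

```python
import numpy as np

def simulate_spread(A, u0, rho=0.1, alpha=1.0, dt=0.01, steps=4000):
    """Fisher-KPP dynamics on a connectome graph:
        du/dt = -rho * L @ u + alpha * u * (1 - u)
    where L is the graph Laplacian of adjacency matrix A and u is the
    misfolded-protein concentration per region (illustrative reduced
    model; parameter values are arbitrary)."""
    L = np.diag(A.sum(axis=1)) - A
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        u += dt * (-rho * (L @ u) + alpha * u * (1.0 - u))
    return u

# five regions in a chain; seed a small concentration in region 0
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
u0 = np.zeros(5)
u0[0] = 0.1
u_final = simulate_spread(A, u0)
```

The seeded concentration grows logistically and propagates along the graph as a traveling front, eventually saturating every region, which is the qualitative behavior such reduced models trade against the full three-dimensional simulation.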
Victoria Bosch, Rowan Sommers, Adrien Doerig, Tim C Kietzmann
Apr 20, 2026·q-bio.NC·PDF Recent studies reveal striking representational alignment between artificial neural networks (ANNs) and biological brains, leading to proposals that all sufficiently capable systems converge on universal representations of reality. Here, we argue that this claim of Universality is premature. We introduce the Umwelt Representation Hypothesis (URH), proposing that alignment arises not from convergence toward a single global optimum, but from overlap in ecological constraints under which systems develop. We review empirical evidence showing that representational differences between species, individuals, and ANNs are systematic and adaptive, which is difficult to reconcile with Universality. Finally, we reframe ANN model comparison as a method for mapping clusters of alignment in ecological constraint space rather than searching for a single optimal world model.
Masanari Asano, Andrei Khrennikov
Apr 19, 2026·q-bio.NC·PDF This paper starts with surveying the evolution of quantum-like models of cognition and decision making, transitioning from static kinematic representations to a robust dynamical framework based on open quantum systems. We provide a comprehensive analysis of the Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) master equation's application in cognitive psychology and decision making, illustrating how it models mental state evolution as a dissipative process influenced by an informational environment. We categorize dynamical regimes into Passive and Active Hamiltonians, demonstrating how non-commutation with projections on decision basis serves as a mathematical signature of cognitive agency and Quantum Escape from classical equilibria. The utility of this framework is further explored through its ability to stabilize non-Nash outcomes in strategic games, such as the Prisoner's Dilemma. Building upon this dynamical foundation, we identify ``cognitive beats'' as a signature of the internal struggle between competing ``flows of mind'' deliberated at approximately equal frequencies. Distinct from the damped oscillations of simple interference, these beats emerge from a structural tension between Liouvillian channels that generates a secondary, slow-scale modulation of conviction. This beat envelope dictates the timing of peak readiness and hesitation, providing a mathematical map of the transition between conflicting cognitive states. By resolving these nested time scales, we provide a new spectral diagnostic for the depth of cognitive agency and the complexity of the underlying deliberation process. This paper develops a theoretical framework linking GKSL dynamics with quantum-like cognition and decision-making (QCDM), highlighting how dissipative quantum models can capture features of human thought and decision processes.
Paul M. Thompson
How much data is enough to make a scientific discovery? As biomedical datasets scale to millions of samples and AI models grow in capacity, progress increasingly depends on predicting when additional data will substantially improve performance. In practice, model development often relies on empirical scaling curves measured across architectures, modalities, and dataset sizes, with limited theoretical guidance on when performance should improve, saturate, or exhibit cross-over behavior. We propose a scaling-law framework for cross-modal discoverability based on spectral structure of data covariance operators, task-aligned signal projections, and learned representations. Many performance metrics, including AUC, can be expressed in terms of cumulative signal-to-noise energy accumulated across identifiable spectral modes of an encoder and cross-modal operator. Under mild assumptions, this accumulation follows a zeta-like scaling law governed by power-law decay of covariance spectra and aligned signal energy, leading naturally to the appearance of the Riemann zeta function. Representation learning methods such as sparse models, low-rank embeddings, and multimodal contrastive objectives improve sample efficiency by concentrating useful signal into earlier stable modes, effectively steepening spectral decay and shifting scaling curves. The framework predicts cross-over regimes in which simpler models perform best at small sample sizes, while higher-capacity or multimodal encoders outperform them once sufficient data stabilizes additional degrees of freedom. Applications include multimodal disease classification, imaging genetics, functional MRI, and topological data analysis. The resulting zeta law provides a principled way to anticipate when scaling data, improving representations, or adding modalities is most likely to accelerate discovery.
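The zeta-like accumulation is easy to illustrate numerically: if the task-aligned signal energy in spectral mode k decays as k^(-s), and more data stabilizes more modes, then total usable signal is a truncated zeta series. The sketch below (parameter names are illustrative) shows the partial sums for s = 2 approaching zeta(2) = pi^2/6, with diminishing returns from each additional stabilized mode.

```python
import numpy as np

def cumulative_signal(n_modes, s=2.0):
    """Cumulative task-aligned signal energy when the energy of the
    k-th stable spectral mode decays as k**(-s); the number of stable
    modes n_modes grows with sample size (illustrative)."""
    k = np.arange(1, n_modes + 1, dtype=float)
    return np.sum(k ** -s)

partial = cumulative_signal(10)        # first 10 modes stabilized
limit = np.pi ** 2 / 6                 # zeta(2): the ceiling for s = 2
```

Steeper spectral decay (larger s, e.g. from sparse or contrastive representations) concentrates the sum in early modes, so the curve saturates with fewer samples; this is the framework's explanation for why better representations shift scaling curves left.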
Moo K. Chung, Luigi Maccotta, Aaron Struck
Apr 19, 2026·q-bio.NC·PDF Cortical folding reflects coordinated neurodevelopmental processes and provides a sensitive marker of neurological disease. In juvenile myoclonic epilepsy (JME), structural abnormalities are subtle and spatially distributed, limiting the sensitivity of conventional morphometric measures such as cortical thickness. We introduce a Poisson flow model derived from gradients of the mean curvature field on the cortical surface. The method yields a smooth scalar field obtained from a Poisson equation, whose surface gradient defines a flow representation of folding organization. This representation enables spatially coherent characterization of sulcal--gyral patterns and provides a principled geometric framework for studying distributed cortical alterations in JME.
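The computational core of such a method is a Poisson solve against a discrete Laplacian. On the cortical surface this would be the cotangent Laplace-Beltrami operator of the mesh with a source built from mean-curvature gradients; the one-dimensional stand-in below (zero Dirichlet boundaries, uniform grid) is only meant to show the structure of the solve.

```python
import numpy as np

def solve_poisson_1d(rho, h):
    """Solve -phi'' = rho on a uniform 1-D grid with zero Dirichlet
    boundary conditions, via the standard second-order finite
    difference Laplacian. A minimal stand-in for the surface Poisson
    problem; rho plays the role of the curvature-derived source."""
    n = len(rho)
    L = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return np.linalg.solve(L, rho)
```

Once phi is available, its (surface) gradient defines the flow field used to characterize folding organization; the manufactured solution phi = sin(pi x), rho = pi^2 sin(pi x) is a convenient correctness check for the solver.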
Anthony Zador, Jean-Marc Fellous, Terrence Sejnowski, Gina Adam, James B Aimone, Akwasi Akwaboah, Yiannis Aloimonos, Carmen Amo Alonso, Chiara Bartolozzi, Michael J. Bennington, Michael Berry, Bing W. Brunton, Gert Cauwenberghs, Hillel J. Chiel, Tobi Delbruck, John Doyle, Jason Eshraghian, Ralph Etienne-Cummings, Cornelia Fermuller, Matthew Jacobsen, Ali A. Minai, Barbara Oakley, Alexander G. Ororbia, Joe Paton, Blake Richards, Yulia Sandamirskaya, Abhronil Sengupta, Shihab Shamma, Michael P. Stryker, Seong Jong Yoo, Steven W. Zucker
Apr 19, 2026·q-bio.NC·PDF Neuroscience and Artificial Intelligence (AI) have made impressive progress in recent years but remain only loosely interconnected. Based on a workshop convened by the National Science Foundation in August 2025, we identify three fundamental capability gaps in current AI: the inability to interact with the physical world, inadequate learning that produces brittle systems, and unsustainable energy and data inefficiency. We describe the neuroscience principles that address each: co-design of body and controller, prediction through interaction, multi-scale learning with neuromodulatory control, hierarchical distributed architectures, and sparse event-driven computation. We present a research roadmap organized around these principles at near, mid, and long-term horizons. We argue that realizing this program requires a new generation of researchers trained across the boundary between neuroscience and engineering, and describe the institutional conditions needed to support them: interdisciplinary training, hardware access, community standards, and ethics. We conclude that NeuroAI, neuroscience-informed artificial intelligence, has the potential to overcome limitations of current AI while deepening our understanding of biological neural computation.
Moo K. Chung, D. Vijay Anand, Anass B El-Yaagoubi, Jae-Hun Jung, Anqi Qiu, Hernando Ombao
Apr 18, 2026·q-bio.NC·PDF Classical causal models, such as Granger causality and structural equation modeling, are largely restricted to acyclic interactions and struggle to represent cyclic and higher-order dynamics in complex networks. We introduce a causal framework grounded in a variational principle, interpreting causality as directional energy flow from high- to low-energy states along network connections. Using Hodge theory, network flows are decomposed into dissipative components and a persistent harmonic component that captures stable cyclic interactions. Applied to resting-state fMRI connectivity, our variational framework reveals robust cyclic causal patterns that are not detected by conventional causal models, highlighting the value of variational principles for causality.
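The Hodge-theoretic step can be sketched on a small graph: an edge flow splits into a gradient (dissipative) part, which is a difference of node potentials, and a remainder that, on a graph with no filled triangles, is exactly the harmonic, cyclic part. The incidence-matrix formulation below is a standard construction, used here only to illustrate the decomposition the abstract applies to fMRI connectivity.

```python
import numpy as np

def hodge_split(B1, f):
    """Split an edge flow f into its gradient part (in the image of
    B1.T, i.e. differences of a node potential phi) and the orthogonal
    remainder. B1: node-by-edge incidence matrix (-1 at the tail and
    +1 at the head of each directed edge). With no 2-simplices the
    remainder is the harmonic, cyclic component."""
    phi, *_ = np.linalg.lstsq(B1.T, f, rcond=None)
    f_grad = B1.T @ phi
    return f_grad, f - f_grad

# directed 4-cycle: edges 0->1, 1->2, 2->3, 3->0
B1 = np.array([[-1.0, 0.0, 0.0, 1.0],
               [1.0, -1.0, 0.0, 0.0],
               [0.0, 1.0, -1.0, 0.0],
               [0.0, 0.0, 1.0, -1.0]])
f = np.array([1.0, 1.0, 1.0, 1.0])   # a pure circulation
f_grad, f_harm = hodge_split(B1, f)
```

A unit circulation around the cycle has zero gradient part and is entirely harmonic, the persistent cyclic component that, per the abstract, standard acyclic causal models cannot represent.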
Nils Leutenegger
A central question in computational neuroscience is whether the learning rule used to train a neural network determines how well its internal representations align with those of the human visual cortex. We present a systematic comparison of four learning rules -- backpropagation (BP), feedback alignment (FA), predictive coding (PC), and spike-timing-dependent plasticity (STDP) -- applied to identical convolutional architectures and evaluated against human fMRI data from the THINGS-fMRI dataset (720 stimuli, 3 subjects) using Representational Similarity Analysis (RSA). Crucially, we include an untrained random-weights baseline that reveals the dominant role of architecture. We find that early visual alignment (V1/V2) is primarily architecture-driven: an untrained CNN achieves rho = 0.071, statistically indistinguishable from BP (rho = 0.072, p = 0.43). Learning rules only differentiate at higher visual areas: BP dominates at LOC/IT, and PC with local Hebbian updates achieves IT alignment statistically indistinguishable from BP (p = 0.18). FA consistently impairs representations below the random baseline at V1. Partial RSA confirms all effects survive pixel-similarity control. These results demonstrate that the relationship between learning rules and cortical alignment is region-specific: architecture determines early alignment, while supervised objectives drive late alignment.
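The RSA comparison underlying the reported rho values can be sketched compactly: build a representational dissimilarity matrix (RDM) for model activations and for brain responses, then take the Spearman correlation of their upper triangles. The 1 - correlation dissimilarity below is one common choice; the study's exact preprocessing is not reproduced here.

```python
import numpy as np

def rdm(X):
    """RDM as 1 - Pearson correlation between stimulus response
    patterns; X has shape (n_stimuli, n_features)."""
    return 1.0 - np.corrcoef(X)

def rsa_score(model_acts, brain_acts):
    """Spearman correlation (rho) between the upper triangles of the
    model RDM and the brain RDM, computed via rank transformation."""
    iu = np.triu_indices(model_acts.shape[0], k=1)
    a, b = rdm(model_acts)[iu], rdm(brain_acts)[iu]
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return np.corrcoef(rank(a), rank(b))[0, 1]
```

In the study's setting, `model_acts` would be layer activations from a CNN trained with BP, FA, PC, or STDP (or the untrained baseline) and `brain_acts` the fMRI response patterns for the same 720 THINGS stimuli in a given ROI.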
William Retnaraj, Simone Betteti, Alexander Davydov, Francesco Bullo, Jorge Cortes
Linear-threshold networks (LTNs) capture the mesoscale behavior of interacting populations of neurons and are of particular interest to control theorists due to their dynamical richness and relative ease of analysis. The aim of this paper is to advance the study of global asymptotic stability in LTNs with asymmetric neural interactions and heterogeneous dissipation under the structural Lyapunov diagonal stability (LDS) condition. To this end, we introduce a one-parameter family of LTNs that preserves the LDS condition and has a parameter-independent equilibrium set. In the fast limit, this family converges to a projected dynamical system (PDS), while in the slow limit, it converges to a discontinuous hard-selector system (HSS). Under LDS, we prove that the fast PDS limit is globally exponentially stable and that the HSS limit is globally asymptotically stable. This alignment suggests that the limiting systems capture essential mechanisms governing stability across the entire LTN family. Together with numerical evidence, these findings indicate that resolving stability at the fast and slow endpoints provides a promising and structurally grounded path toward establishing global stability for LTNs with biologically plausible recurrence and diagonal dissipation.
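A minimal LTN simulation makes the stability claim tangible. The form below, dx/dt = -d * x + relu(W x + u) with heterogeneous diagonal dissipation d and asymmetric W, is one common LTN parameterization used here only for illustration; the paper analyzes a one-parameter family interpolating between the PDS and HSS limits, which this sketch does not implement.

```python
import numpy as np

def simulate_ltn(W, d, u, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the linear-threshold dynamics
        dx/dt = -d * x + relu(W @ x + u)
    with heterogeneous dissipation d (illustrative form)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x += dt * (-d * x + np.maximum(W @ x + u, 0.0))
    return x

# small asymmetric network; ||W|| < 1 with unit dissipation, so the
# dynamics are contracting and every trajectory reaches one equilibrium
W = np.array([[0.0, -0.5], [0.3, 0.0]])
d = np.array([1.0, 1.0])
u = np.array([1.0, 0.5])
x_a = simulate_ltn(W, d, u, np.zeros(2))
x_b = simulate_ltn(W, d, u, 5.0 * np.ones(2))
```

Trajectories from very different initial conditions converge to the same equilibrium, the kind of global asymptotic stability that the paper establishes analytically (under the LDS condition) at the fast and slow endpoints of the family.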
Qianchen Gong, Yingpeng Liu, Yan Zhang, Muhua Zheng, Kesheng Xu
Apr 17, 2026·q-bio.NC·PDF Experimental evidence indicates that intracellular chloride concentration regulates the excitation and inhibition (EI) balance, yet the mechanisms by which activity-dependent chloride dynamics drive seizure evolution and stage transitions remain unclear. We present a conductance-based neuronal network in which EI balance emerges from chloride homeostasis via channel-mediated influx and transporter-mediated extrusion. We show that the fraction of inhibitory synaptic conductance contributing to channel-mediated influx acts as a control parameter that organizes seizure dynamics into distinct stages (pre-ictal, ictal-tonic, and ictal-clonic), distinguished by characteristic amplitude and frequency signatures. Decreasing this fraction shortens ictal activity and suppresses seizure initiation, whereas a high fraction promotes the emergence of ictal-tonic and ictal-clonic stages and spiral-wave dynamics, rendering seizure dynamics largely insensitive to inhibition. At intermediate values, seizures bypass the ictal-tonic stage and emerge directly as the ictal-clonic stage. Moreover, joint variation of this fraction with synaptic strengths reveals that recurrent excitation expands the tonic-clonic seizure regime, while recurrent inhibition prolongs pre-ictal states and suppresses ictal-clonic activity.
Christophe Pallier, Julie Bonnaire, Marie-France Fourcade
Apr 16, 2026·q-bio.NC·PDF We introduce `Goxpyriment', a new open-source software framework for programming behavioral and cognitive experiments using the Go programming language. The library is designed to address some limitations of existing Python-based experiment tools, particularly the runtime environment complexity that frequently complicates deployment across laboratories. Because Go is a compiled language that can natively embed assets (e.g., graphics, audio files, and stimulus lists), Goxpyriment compiles entire experiments into single, self-contained executable binaries with zero runtime dependencies. This drastically simplifies distribution to collaborators and testing computers. The programming interface, inspired by Expyriment (Krause & Lindemann, 2014), was designed to be human friendly. The library includes an array of visual stimuli (text, shapes, images, Gabor patches, motion clouds, ...) and audio capabilities (WAV playback and tone generation). While developing Goxpyriment, we focused on timing reliability. Input events are timestamped by the operating system at hardware-interrupt time, so reaction times are computed by subtracting two OS-level timestamps rather than relying on continuous polling. Go's garbage collector can be disabled, greatly reducing the probability of unpredictable pauses that could corrupt stimulus timing. Finally, a set of over forty psychology experiments implemented in Goxpyriment is provided; these examples not only support learning by human users but also improve the ability of modern AI-assisted coding tools to help program experiments. The framework is released under the GNU General Public License v3 and is freely available at https://github.com/chrplr/goxpyriment.
Giovanni M. Di Liberto
Apr 16, 2026·q-bio.NC·PDF Encoding models enable measurement of how our brains represent sensory inputs using electro- and magneto-encephalography (MEEG). Evaluating how closely encoding models reflect the underlying brain functions is a crucial premise for model interpretation and hypothesis testing. However, the ground-truth neural activity is unknown, preventing model evaluation with respect to the target neural signal. Existing evaluation metrics must therefore relate a model's predictions to noisy MEEG measurements, where most variance is stimulus-unrelated. Here, I introduce an evaluation framework in which model predictions are compared to a ground-truth approximation, obtained by aligning MEEG signals with predictions using canonical correlation analysis and via participant averaging. The resulting metric (CPA-PA) yields single-participant evaluations outperforming conventional scores by 300-1000% on synthetic EEG data and 250% on 34 real MEEG datasets (818 datapoints). These gains reflect increased sensitivity to stimulus-relevant neural activity and reduced dependence on SNR, establishing ground-truth approximation as a robust framework for evaluating encoding models.
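The participant-averaging half of the idea can be demonstrated in a few lines of synthetic-data Python (the CCA alignment step of the full CPA-PA metric is omitted, and all signals below are stand-ins): averaging measurements across subjects suppresses the stimulus-unrelated variance, so a good model's correlation score rises toward its value against the true signal.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.normal(size=1000)                   # unknown stimulus-driven signal
pred = truth + 0.3 * rng.normal(size=1000)      # an accurate encoding model
# each subject's MEEG: the same signal buried in large idiosyncratic noise
subjects = [truth + 2.0 * rng.normal(size=1000) for _ in range(20)]

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

single = corr(pred, subjects[0])                # conventional evaluation
avg = corr(pred, np.mean(subjects, axis=0))     # ground-truth approximation
```

With these noise levels the single-subject score badly understates model quality while the score against the 20-subject average approaches the model's true fidelity, illustrating why a ground-truth approximation gives more sensitive, less SNR-dependent evaluations.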