cs.NE — arXiv2

Showing 1–20 of 17,522 results

Neuromorphic Computing Based on Parametrically-Driven Oscillators and Frequency Combs

Mahadev Sunil Kumar, Adarsh Ganesan

Apr 23, 2026·cs.NE·PDF

Parametrically driven oscillators provide a natural platform for neuromorphic computation, where nonlinear mode coupling and intrinsic dynamics enable both memory and high-dimensional transformation. Here, we investigate a two-mode system exhibiting 2:1 parametric resonance and demonstrate its operation as a reservoir computer across distinct dynamical regimes, including sub-threshold, parametric resonance, and frequency-comb states. By encoding input signals into the drive amplitude and sampling the resulting temporal and spectral responses, we perform one-step-ahead prediction of benchmark chaotic systems, including Mackey-Glass, Rössler, and Lorenz dynamics. We find that optimal computational performance is achieved within the parametric resonance regime, where nonlinear interactions are activated while temporal coherence is preserved. In contrast, although frequency-comb states introduce increased spectral dimensionality, their performance is not consistently good across their existence band and also degrades in the chaotic comb regime due to loss of phase coherence. Mapping prediction error over parameter space reveals a direct correspondence between computational capability and the underlying bifurcation structure, with low-error regions aligned with the parametric resonance boundary. We further show that the input modulation, the detuning from the frequency matching condition, damping ratio, and input data rate systematically control the accessible dynamical regimes and thereby the computational performance. These results establish parametric resonance as a robust operating regime for oscillator-based reservoir computing and provide design principles for tuning physical systems toward optimal neuromorphic functionality.

Geometric Monomial (GEM): a family of rational 2N-differentiable activation functions

Eylon E. Krause

Apr 23, 2026·cs.LG·PDF

The choice of activation function plays a crucial role in the optimization and performance of deep neural networks. While the Rectified Linear Unit (ReLU) remains the dominant choice due to its simplicity and effectiveness, its lack of smoothness may hinder gradient-based optimization in deep architectures. In this work we propose a family of $C^{2N}$-smooth activation functions whose gate follows a log-logistic CDF, achieving ReLU-like performance with purely rational arithmetic. We introduce three variants: GEM (the base family), E-GEM (an $ε$-parameterized generalization enabling arbitrary $L^p$-approximation of ReLU), and SE-GEM (a piecewise variant eliminating dead neurons with $C^{2N}$ junction smoothness). An $N$-ablation study establishes $N=1$ as optimal for standard-depth networks, reducing the GELU deficit on CIFAR-100 + ResNet-56 from 6.10% to 2.12%. The smoothness parameter $N$ further reveals a CNN-transformer tradeoff: $N=1$ is preferred for deep CNNs, while $N=2$ is preferred for transformers. On MNIST, E-GEM ties the best baseline (99.23%). On CIFAR-10 + ResNet-56, SE-GEM ($ε=10^{-4}$) surpasses GELU (92.51% vs 92.44%) -- the first GEM-family activation to outperform GELU. On CIFAR-100 + ResNet-56, E-GEM reduces the GELU deficit from 6.10% (GEM $N=2$) to just 0.62%. On GPT-2 (124M), GEM achieves the lowest perplexity (72.57 vs 73.76 for GELU), with GEM $N=1$ also beating GELU (73.32). On BERT-small, E-GEM ($ε=10$) achieves the best validation loss (6.656) across all activations. The $ε$-parameterization reveals a scale-dependent optimum: small $ε$ ($10^{-4}$--$10^{-6}$) for deep CNNs and larger transformers, with the special case of small transformers (BERT-small) benefiting from large $ε$ ($ε=10$) due to its limited depth and unconstrained gradients.
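The abstract's description of a gate that follows a log-logistic CDF and uses only rational arithmetic can be illustrated with a small sketch. It is not the paper's definition: the unit scale parameter, the shape parameter $2N$, and the names `rational_gate` and `gated_activation` are assumptions made here for illustration.

```python
def rational_gate(x: float, n: int = 1) -> float:
    """Log-logistic CDF with scale 1 and shape 2n (assumed parameters).

    Equals x^(2n) / (1 + x^(2n)) for x > 0 and 0 otherwise.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        # Equivalent form 1 / (1 + x^(-2n)); avoids overflow for large x.
        return 1.0 / (1.0 + x ** (-2 * n))
    p = x ** (2 * n)
    return p / (1.0 + p)


def gated_activation(x: float, n: int = 1) -> float:
    """ReLU-like activation x * gate(x), built from rational operations only."""
    return x * rational_gate(x, n)
```

Under these assumptions the positive branch behaves like $x^{2N+1}$ near zero, so it joins the zero branch with matching derivatives up to order $2N$, and it approaches the identity for large $x$.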

On the Role of Preprocessing and Memristor Dynamics in Reservoir Computing for Image Classification

Rishona Daniels, Duna Wattad, Ronny Ronen, David Saad, Shahar Kvatinsky

Apr 23, 2026·cs.NE·PDF

Reservoir computing (RC) is an emerging recurrent neural network architecture that has attracted growing attention for its low training cost and modest hardware requirements. Memristor-based circuits are particularly promising for RC, as their intrinsic dynamics can reduce network size and parameter overhead in tasks such as time-series prediction and image recognition. Although RC has been demonstrated with several memristive devices, a comprehensive evaluation of device-level requirements remains limited. In this paper, we analyze and explain the operation of a parallel delayed feedback network (PDFN) RC architecture with volatile memristors, focusing on how device characteristics -- such as decay rate, quantization, and variability -- affect reservoir performance. We further discuss strategies to improve data representation in the reservoir using preprocessing methods and suggest potential improvements. The proposed approach achieves 95.89% classification accuracy on MNIST, comparable with the best reported memristor-based RC implementations. Furthermore, the method maintains high robustness under 20% device variability, achieving an accuracy of up to 94.2%. These results demonstrate that volatile memristors can support reliable spatio-temporal information processing and reinforce their potential as key building blocks for compact, high-speed, and energy-efficient neuromorphic computing systems.

Novelty-Based Generation of Continuous Landscapes with Diverse Local Optima Networks

Kippei Mizuta, Shoichiro Tanaka, Shuhei Tanaka, Toshiharu Hatanaka

Apr 23, 2026·cs.NE·PDF

Local Optima Networks (LONs) represent the global structure of search spaces as graphs, but their construction requires iterative execution of a search algorithm to find local optima and approximate transitions between Basins of Attraction (BoAs). In continuous optimization, this high computational cost prevents systematic investigation of the relationship between LON features and evolutionary algorithm performance. To address this issue, we propose an alternative definition of BoAs for Max-Set of Gaussians (MSG) landscapes with explicitly tunable multimodality. This bypasses search-based BoA identification, enabling low-cost LON construction. Moreover, we leverage Novelty Search (NS) to explore the parameter space of the MSG landscape generator, producing instances with diverse graph topologies. Our experiments show that the proposed BoAs closely align with gradient-based BoAs, and that NS successfully generates instances with varied search difficulty and connectivity patterns among optima. Finally, over the instances generated by NS, we predict the success rate of two well-established evolutionary algorithms from LON features. While our LON construction is specific to MSG landscapes, the proposed framework provides a dataset that serves as a foundation for landscape-aware optimization.
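As a rough illustration of a Max-Set of Gaussians landscape, the sketch below evaluates the maximum over a set of scaled Gaussians and reports which component attains it. Two points are assumptions rather than the paper's method: the components are isotropic, and the index of the dominant Gaussian is used as a search-free basin label, which is one plausible reading of the abstract. The function name is hypothetical.

```python
import math


def msg_value_and_basin(x, centers, widths, heights):
    """Return (fitness, index of the dominant Gaussian) at point x.

    Fitness is max_i heights[i] * exp(-0.5 * ||x - centers[i]||^2 / widths[i]^2).
    Labelling x by the arg-max component is an assumed basin definition.
    """
    best_val, best_idx = -math.inf, -1
    for i, (c, s, h) in enumerate(zip(centers, widths, heights)):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        val = h * math.exp(-0.5 * sq_dist / (s * s))
        if val > best_val:
            best_val, best_idx = val, i
    return best_val, best_idx
```

Because the label comes from one pass over the components, no local search is needed to assign a point to a basin in this simplified setting.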

Trust-SSL: Additive-Residual Selective Invariance for Robust Aerial Self-Supervised Learning

Wadii Boulila, Adel Ammar, Bilel Benjdira, Maha Driss

Apr 23, 2026·cs.CV·PDF

Self-supervised learning (SSL) is a standard approach for representation learning in aerial imagery. Existing methods enforce invariance between augmented views, which works well when augmentations preserve semantic content. However, aerial images are frequently degraded by haze, motion blur, rain, and occlusion that remove critical evidence. Enforcing alignment between a clean and a severely degraded view can introduce spurious structure into the latent space. This study proposes a training strategy and architectural modification to enhance SSL robustness to such corruptions. It introduces a per-sample, per-factor trust weight into the alignment objective, combined with the base contrastive loss as an additive residual. A stop-gradient is applied to the trust weight instead of a multiplicative gate. While a multiplicative gate is a natural choice, experiments show it impairs the backbone, whereas our additive-residual approach improves it. Using a 200-epoch protocol on a 210,000-image corpus, the method achieves the highest mean linear-probe accuracy among six backbones on EuroSAT, AID, and NWPU-RESISC45 (90.20% compared to 88.46% for SimCLR and 89.82% for VICReg). It yields the largest improvements under severe information-erasing corruptions on EuroSAT (+19.9 points on haze at s=5 over SimCLR). The method also demonstrates consistent gains of +1 to +3 points in Mahalanobis AUROC on a zero-shot cross-domain stress test using BDD100K weather splits. Two ablations (scalar uncertainty and cosine gate) indicate the additive-residual formulation is the primary source of these improvements. An evidential variant using Dempster-Shafer fusion introduces interpretable signals of conflict and ignorance. These findings offer a concrete design principle for uncertainty-aware SSL. Code is publicly available at https://github.com/WadiiBoulila/trust-ssl.

CO$_2$ sequestration hybrid solver using isogeometric alternating-directions and collocation-based robust variational physics informed neural networks (IGA-ADS-CRVPINN)

Askold Vilkha, Tomasz Służalec, Marcin Łoś, Maciej Paszyński

Apr 22, 2026·math.NA·PDF

This paper presents a hybrid solver for a $CO_2$ sequestration problem. The solver uses the IGA-ADS (IsoGeometric Analysis Alternating Directions solver) to compute the saturation scalar field update using the explicit method, and CRVPINN (Collocation-based Robust Variational Physics Informed Neural Networks solver) to compute the pressure scalar field. The study focuses on simulating the physical behavior of $CO_2$ in porous structures, excluding chemical reactions. The mathematical model is based on Darcy's Law. The CRVPINN is pretrained on the initial pressure configuration, and the time step pressure updates require only 100 iterations of the Adam method per time step. We compare our hybrid IGA-ADS solver, coupled with the CRVPINN method, with a baseline of the IGA-ADS solver coupled with the MUMPS direct solver. Our hybrid solver is over 3 times faster on a single computational node from the ARES cluster of ACK CYFRONET. Future work includes extensive testing, inverse problem solving, and potential application to $H_2$ storage problems.

Learning Hippo: Multi-attractor Dynamics and Stability Effects in a Biologically Detailed CA3 Extension of Hopfield Networks

Daniele Corradetti, Renato Corradetti

Apr 22, 2026·cs.NE·PDF

We present a biologically detailed extension of the classical Hopfield/Marr auto-associative memory model for CA3, implementing ten populations (two asymmetric pyramidal subtypes, eight GABAergic interneuron classes), forty-seven compartments, multi-rule plasticity (recurrent Hebb, BCM anti-saturation, mossy-fiber short-term, endocannabinoid iLTD, burst-gated Hebb), and a bimodal cholinergic encoding/consolidation cycle. Evaluated on pattern completion across auto-associative, associative, and temporal regimes, and on a controlled inhibitory-proportion manipulation at $N=256$, the full architecture exhibits three qualitative signatures absent from a minimal Hopfield baseline: (i) multi-attractor cross-seed behaviour at $K=5$ with biologically realistic inhibitory proportions, where two of five seeds converge to positive attractors with margin $+0.10$--$0.22$ (Cohen's $d=0.71$, one-sided $p=0.08$); (ii) target-selective associative recall in paired $(A, B)$ memory at $K\geq 5$, where the full model retrieves $B$ from a partial cue of $A$ while the minimal model echoes $A$ (Pearson margin $Δ=+0.163$ at $K=5$); (iii) reduced cross-seed variance of the full model below the minimal baseline under clean upstream, with ratios $1.0$--$3.0$. These three signatures are architecture-specific: they appear consistently across independent regimes and are absent from the minimal control.

An explicit operator explains end-to-end computation in the modern neural networks used for sequence and language modeling

Anif N. Shikder, Ramit Dey, Sayantan Auddy, Luisa Liboni, Alexandra N. Busch, Arthur Powanwe, Ján Mináč, Roberto C. Budzinski, Lyle E. Muller

Apr 22, 2026·cs.NE·PDF

We establish a mathematical correspondence between state space models, a state-of-the-art architecture for capturing long-range dependencies in data, and an exactly solvable nonlinear oscillator network. As a specific example of this general correspondence, we analyze the diagonal linear time-invariant implementation of the Structured State Space Sequence model (S4). The correspondence embeds S4D, a specific implementation of S4, into a ring network topology, in which recent inputs are encoded as waves of activity traveling over the one-dimensional spatial layout of the network. We then derive an exact operator expression for the full forward pass of S4D, yielding an analytical characterization of its complete input-output map. This expression reveals that the nonlinear decoder in the system induces interactions between these information-carrying waves that enable classifying real-world sequences. These results generalize across modern SSM architectures, and show that they admit an exact mathematical description with a clear physical interpretation. These insights enable a new level of interpretability for these systems in terms of nonlinear oscillator networks.
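For readers unfamiliar with diagonal linear time-invariant state space layers, the sketch below shows the generic discrete-time recurrence and the equivalent causal convolution. It covers only the linear part and is not the operator expression derived in the paper; the parameter names `a`, `b`, `c` and the function names are illustrative assumptions, and no nonlinear decoder is included.

```python
def ssm_recurrent(a, b, c, u):
    """Run x_k = a * x_{k-1} + b * u_k elementwise; emit y_k = Re(sum_n c_n x_k[n])."""
    state = [0j] * len(a)
    outputs = []
    for u_k in u:
        state = [a_n * x_n + b_n * u_k for a_n, b_n, x_n in zip(a, b, state)]
        outputs.append(sum(c_n * x_n for c_n, x_n in zip(c, state)).real)
    return outputs


def ssm_kernel(a, b, c, length):
    """Convolution kernel K_j = Re(sum_n c_n * a_n**j * b_n) for j = 0..length-1."""
    return [
        sum(c_n * (a_n ** j) * b_n for a_n, b_n, c_n in zip(a, b, c)).real
        for j in range(length)
    ]


def ssm_convolution(a, b, c, u):
    """Same input-output map written as y_t = sum_{j<=t} K_j * u_{t-j}."""
    kernel = ssm_kernel(a, b, c, len(u))
    return [sum(kernel[j] * u[t - j] for j in range(t + 1)) for t in range(len(u))]
```

Each diagonal entry with a complex `a_n` of modulus below one acts as a damped oscillating mode, so the two functions return the same outputs for real inputs up to floating-point error.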

Response time of lateral predictive coding and benefits of modular structures

Guanghui Cai, Zhen-Ye Huang, Weikang Wang, Hai-Jun Zhou

Apr 22, 2026·q-bio.NC·PDF

Lateral predictive coding (LPC) is a simple theoretical framework for understanding feature detection in biological neural circuits. Recent theoretical work [Huang et al., Phys.Rev.E 112, 034304 (2025)] has successfully constructed optimal LPC networks capable of extracting non-Gaussian hidden input features by imposing the tradeoff between energetic cost and information robustness, but the resulting dynamical systems of recurrent interactions can be very slow in responding to external inputs. We investigate response-time reduction in the present paper. We find that the characteristic response time of the LPC system can be reduced until it closely approaches the lower-bound value without compromising the mean predictive error (energetic cost) and the information robustness of signal transmission. We further demonstrate that optimal LPC networks with a modular structural organization and a greatly reduced number of lateral interactions perform as well as fully connected (all-to-all) networks in terms of feature detection performance, response time, energetic cost and information robustness.

Distributional Value Estimation Without Target Networks for Robust Quality-Diversity

Behrad Koohy, Jamie Bayne

Apr 22, 2026·cs.LG·PDF

Quality-Diversity (QD) algorithms excel at discovering diverse repertoires of skills, but are hindered by poor sample efficiency and often require tens of millions of environment steps to solve complex locomotion tasks. Recent advances in Reinforcement Learning (RL) have shown that high Update-to-Data (UTD) ratios accelerate Actor-Critic learning. While effective, standard high-UTD algorithms typically utilise target networks to stabilise training. This requirement introduces a significant computational bottleneck, rendering them impractical for resource-intensive Quality-Diversity (QD) tasks where sample efficiency and rapid population adaptation are critical. In this paper, we introduce QDHUAC, a sample-efficient, target-free and distributional QD-RL algorithm that provides dense and low-variance gradient signals, which enables high-UTD training for Dominated Novelty Search whilst requiring an order of magnitude fewer environment steps. We demonstrate that our method enables stable training at high UTD ratios, achieving competitive coverage and fitness on high-dimensional Brax environments with an order of magnitude fewer samples than baselines. Our results suggest that combining target-free distributional critics with dominance-based selection is a key enabler for the next generation of sample-efficient evolutionary RL algorithms.

Neuro-evolutionary stochastic architectures in gauge-covariant neural fields

Rodrigo Carmo Terin

Apr 22, 2026·cs.NE·PDF

We extend our gauge-covariant stochastic neural-field framework by promoting architecture-level parameters to slow stochastic variables evolving in function space. Our effective theory is formulated in terms of classical commuting fields and provides symmetry-constrained diagnostics of marginality and finite-width effects through the maximal Lyapunov exponent, the amplification factor, and dressed spectral kernels. On top of this dynamics, we introduce a Markovian evolutionary scheme compatible with the local $U(1)$ structure of the effective model. By using a minimal implementation, the genotype is reduced to the weight-variance parameter $σ_w^2$, and the fitness functional combines spectral agreement, marginal stability, and a symmetry-constrained critical anchor. Comparing three evolutionary models, we find that only the fully symmetry-constrained Ginibre $U(1)$ version robustly approaches a narrow near-marginal regime and reproduces the predicted low-frequency finite-width spectral behavior. These results support the use of symmetry-guided effective stability diagnostics as practical principles for stochastic architecture search in controlled settings.

Quantization robustness from dense representations of sparse functions in high-capacity kernel associative memory

Akira Tamamori

Apr 22, 2026·cs.NE·PDF

High-capacity associative memories based on Kernel Logistic Regression (KLR) are known for their exceptional performance but are hindered by high computational costs. This paper investigates the compressibility of KLR-trained Hopfield networks to understand the geometric principles of their robust encoding. We provide a comprehensive geometric theory based on spontaneous symmetry breaking and Walsh analysis, and validate it with compression experiments (quantization and pruning). Our experiments reveal a striking contrast: the network is extremely robust to low-precision quantization but highly sensitive to pruning. Our theory explains this via a "sparse function, dense representation" principle, where a sparse input mapping is implemented with a dense, bimodal parameterization. Our findings not only provide a practical path to hardware-efficient kernel memories but also offer new insights into the geometric principles of robust representation in neural systems.

What Makes an LLM a Good Optimizer? A Trajectory Analysis of LLM-Guided Evolutionary Search

Xinhao Zhang, Xi Chen, François Portet, Maxime Peyrard

Apr 21, 2026·cs.CL·PDF

Recent work has demonstrated the promise of orchestrating large language models (LLMs) within evolutionary and agentic optimization systems. However, the mechanisms driving these optimization gains remain poorly understood. In this work, we present a large-scale study of LLM-guided evolutionary search, collecting optimization trajectories for 15 LLMs across 8 tasks. Although zero-shot problem-solving ability correlates with final optimization outcomes, it explains only part of the variance: models with similar initial capability often induce dramatically different search trajectories and outcomes. By analyzing these trajectories, we find that strong LLM optimizers behave as local refiners, producing frequent incremental improvements while progressively localizing the search in semantic space. Conversely, weaker optimizers exhibit large semantic drift, with sporadic breakthroughs followed by stagnation. Notably, various measures of solution novelty do not predict final performance; novelty is beneficial only when the search remains sufficiently localized around high-performing regions of the solution space. Our results highlight the importance of trajectory analysis for understanding and improving LLM-based optimization systems and provide actionable insights for their design and training.

Scalable Memristive-Friendly Reservoir Computing for Time Series Classification

Coşku Can Horuz, Andrea Ceni, Claudio Gallicchio, Sebastian Otte

Apr 21, 2026·cs.NE·PDF

Memristive devices present a promising foundation for next-generation information processing by combining memory and computation within a single physical substrate. This unique characteristic enables efficient, fast, and adaptive computing, particularly well suited for deep learning applications. Among recent developments, the memristive-friendly echo state network (MF-ESN) has emerged as a promising approach that combines memristive-inspired dynamics with the training simplicity of reservoir computing, where only the readout layer is learned. Building on this framework, we propose memristive-friendly parallelized reservoirs (MARS), a simplified yet more effective architecture that enables efficient scalable parallel computation and deeper model composition through novel subtractive skip connections. This design yields two key advantages: substantial training speedups of up to 21x over the inherently lightweight echo state network baseline and significantly improved predictive performance. Moreover, MARS demonstrates what is possible with parallel memristive-friendly reservoir computing: on several long sequence benchmarks our compact gradient-free models substantially outperform strong gradient-based sequence models such as LRU, S5, and Mamba, while reducing full training time from minutes or hours down to seconds or even only a few hundred milliseconds. Our work positions parallel memristive-friendly computing as a promising route towards scalable neuromorphic learning systems that combine high predictive capability with radically improved computational efficiency, while providing a clear pathway to energy-efficient, low-latency implementations on emerging memristive and in-memory hardware.

Large Language Models Exhibit Normative Conformity

Mikako Bito, Keita Nishimoto, Kimitaka Asatani, Ichiro Sakata

Apr 21, 2026·cs.AI·PDF

The conformity bias exhibited by large language models (LLMs) can pose a significant challenge to decision-making in LLM-based multi-agent systems (LLM-MAS). While many prior studies have treated "conformity" simply as a matter of opinion change, this study introduces the social psychological distinction between informational conformity and normative conformity in order to understand LLM conformity at the mechanism level. Specifically, we design new tasks to distinguish between informational conformity, in which participants in a discussion are motivated to make accurate judgments, and normative conformity, in which participants are motivated to avoid conflict or gain acceptance within a group. We then conduct experiments based on these task settings. The experimental results show that, among the six LLMs evaluated, up to five exhibited tendencies toward not only informational conformity but also normative conformity. Furthermore, intriguingly, we demonstrate that by manipulating subtle aspects of the social context, it may be possible to control the target toward which a particular LLM directs its normative conformity. These findings suggest that decision-making in LLM-MAS may be vulnerable to manipulation by a small number of malicious users. In addition, through analysis of internal vectors associated with informational and normative conformity, we suggest that although both behaviors appear externally as the same form of "conformity," they may in fact be driven by distinct internal mechanisms. Taken together, these results may serve as an initial milestone toward understanding how "norms" are implemented in LLMs and how they influence group dynamics.

Neutrally Evolving Interlocking Complexity in the Quandary Den

Andrew Walsh

Apr 20, 2026·cs.NE·PDF

Molecular biology features numerous complexes of proteins that coordinate in an interlocking fashion to fulfill different functions. Adaptive evolution explains some of this complexity, but needn't be the default when neutral explanations suffice. A new artificial life model "organism," the Quandary Den, is introduced to explore different neutral evolution scenarios where complexity increases in the absence of greater informational needs. Two interlocking complexity scenarios emerge. Subfunctionalization leads to functionality diffusing through the complex. Masking allows intracomplex interference to accumulate genetically, requiring that it be blocked at the level of expression.

Similarity-based Portfolio Construction for Black-box Optimization

Catalin-Viorel Dinu, Diederick Vermetten, Carola Doerr

Apr 20, 2026·cs.NE·PDF

In black-box optimization, a central question is which algorithm to use to solve a given, previously unseen, problem. Selecting a single algorithm, however, entails inherent risks: inaccuracies in the selector may lead to poor choices, and even well-performing algorithms with high variance can yield unsatisfactory results in a single run. A natural remedy is to split the evaluation budget across multiple runs of potentially different algorithms. Such sequential algorithm portfolios benefit from variance reduction and complementarities between algorithms, often outperforming approaches that allocate the entire budget to a single solver. While effective portfolios can be constructed post-hoc, transferring this idea to the algorithm selection setting is non-trivial. We show that a naive portfolio constructed over the full training set already outperforms the strongest traditional baseline, the virtual best solver. We then propose a simple yet effective k-nearest-neighbor-based finetuning approach to construct portfolios tailored to unseen instances, yielding further improvements and highlighting the effectiveness of portfolio selection in fixed-budget black-box optimization.

The Magnitude of Dominated Sets: A Pareto Compliant Indicator Grounded in Metric Geometry

Michael T. M. Emmerich

Apr 20, 2026·math.OC·PDF

We investigate magnitude as a new unary and strictly Pareto-compliant quality indicator for finite approximation sets to the Pareto front in multiobjective optimization. Magnitude originates in enriched category theory and metric geometry, where it is a notion of size or point content for compact metric spaces and a generalization of cardinality. For dominated regions in the $\ell_1$ box setting, magnitude is close to hypervolume but not identical: it contains the top-dimensional hypervolume term together with positive lower-dimensional projection and boundary contributions. This paper gives a first theoretical study of magnitude as an indicator. We consider multiobjective maximization with a common anchor point. For dominated sets generated by finite approximation sets, we derive an all-dimensional projection formula, prove weak and strict set monotonicity on finite unions of anchored boxes, and thereby obtain weak and strict Pareto compliance. Unlike hypervolume, magnitude assigns positive value to boundary points sharing one or more coordinates with the anchor point, even when their top-dimensional hypervolume contribution vanishes. We then formulate projected set-gradient methods and compare hypervolume and magnitude on biobjective and three-dimensional simplex examples. Numerically, magnitude favors boundary-including populations and, for suitable cardinalities, complete Das--Dennis grids, whereas hypervolume prefers more interior-filling configurations. Computationally, magnitude reduces to hypervolume on coordinate projections; for fixed dimension this yields the same asymptotic complexity up to a factor $2^d-1$, and in dimensions two and three $Θ(n\log n)$ time. These results identify magnitude as a mathematically natural and computationally viable alternative to hypervolume for finite Pareto front approximations.
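The top-dimensional hypervolume term mentioned in the abstract can be made concrete for two objectives. The sketch below is the standard sort-and-sweep computation for biobjective maximization with an anchor point, running in $O(n\log n)$ time. It does not implement the paper's magnitude indicator or its projection formula, and the function name and default anchor are assumptions.

```python
def hypervolume_2d(points, anchor=(0.0, 0.0)):
    """Area of the union of boxes [anchor, p] for biobjective maximization."""
    ax, ay = anchor
    # Keep points that weakly dominate the anchor; sweep from largest x down.
    kept = sorted(
        (p for p in points if p[0] >= ax and p[1] >= ay),
        key=lambda p: p[0],
        reverse=True,
    )
    area = 0.0
    best_y = ay
    for x, y in kept:
        if y > best_y:
            # New horizontal strip between the previous best y and this y.
            area += (x - ax) * (y - best_y)
            best_y = y
    return area
```

A point that shares a coordinate with the anchor, such as `(0.0, 4.0)` with the default anchor, adds a strip of zero width and leaves the hypervolume unchanged. This is the case in which, according to the abstract, magnitude still assigns positive value.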

On Scalability of Multi-Objective Evolutionary Algorithms on Combinatorial Optimisation Problems

Menghao Tang, Zimin Liang, Miqing Li

Apr 20, 2026·cs.NE·PDF

Scalability of evolutionary algorithms refers to assessing how their performance changes as problem size increases. In the area of multi-objective optimisation, research on the scalability of multi-objective evolutionary algorithms (MOEAs) has predominantly focussed on continuous problems. However, multi-objective combinatorial optimisation problems (MOCOPs) differ from continuous ones. Their discrete and rigid structure often brings rugged landscape, numerous local optimal solutions and disjoint global optimal regions. This leads to different behaviour of MOEAs. For example, SEMO, a simple MOEA without mating selection and diversity maintenance mechanisms, has been shown to be highly competitive, and in many cases to outperform more sophisticated MOEAs on MOCOPs. Yet, it remains unclear whether such findings hold for large-scale cases. In this paper, we conduct an empirical investigation into the scalability of MOEAs on combinatorial problems, with problem size from 50 to 5,000. Our results show that SEMO experiences a decline in convergence speed as dimensionality increases, compared to other MOEAs such as NSGA-II, SMS-EMOA and MOEA/D. We further demonstrate that the absence of crossover is a major contributor to SEMO's underperformance in large-scale problems, and that incorporating crossover into SEMO can substantially accelerate convergence in general, despite being detrimental in spreading solutions over the Pareto front.
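A minimal sketch of SEMO as it is commonly described (uniform parent selection, one-bit-flip mutation, an archive of mutually non-dominated solutions, no crossover) may help make the comparison concrete. The OneMinMax toy objective, the acceptance rule variant, and the function names are assumptions chosen for brevity; they are not the benchmark problems or the exact implementation used in the paper.

```python
import random


def one_min_max(bits):
    """Toy bi-objective problem: maximize both the number of ones and of zeros."""
    ones = sum(bits)
    return (ones, len(bits) - ones)


def weakly_dominates(f, g):
    """True if f is at least as good as g in every (maximized) objective."""
    return all(a >= b for a, b in zip(f, g))


def semo(n_bits, iterations, objective=one_min_max, seed=0):
    """Return a dict mapping each kept solution to its objective vector."""
    rng = random.Random(seed)
    start = tuple(rng.randint(0, 1) for _ in range(n_bits))
    population = {start: objective(start)}
    for _ in range(iterations):
        parent = rng.choice(list(population))  # no mating selection
        i = rng.randrange(n_bits)              # flip exactly one bit
        child = parent[:i] + (1 - parent[i],) + parent[i + 1:]
        f_child = objective(child)
        if any(weakly_dominates(f, f_child) for f in population.values()):
            continue  # child adds nothing new to the archive
        # Drop members the child dominates, then insert the child.
        population = {
            x: f for x, f in population.items()
            if not weakly_dominates(f_child, f)
        }
        population[child] = f_child
    return population
```

Calling `semo(8, 2000)` returns an archive with at most nine objective vectors, since OneMinMax on eight bits has only nine distinct trade-offs. Each offspring differs from its parent in a single bit, so progress per step is small, which is consistent with the abstract's observation that the absence of crossover hurts on large instances.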

On the Generalization Bounds of Symbolic Regression with Genetic Programming

Masahiro Nomura, Ryoki Hamano, Isao Ono

Apr 19, 2026·cs.LG·PDF

Symbolic regression (SR) with genetic programming (GP) aims to discover interpretable mathematical expressions directly from data. Despite its strong empirical success, the theoretical understanding of why GP-based SR generalizes beyond the training data remains limited. In this work, we provide a learning-theoretic analysis of SR models represented as expression trees. We derive a generalization bound for GP-style SR under constraints on tree size, depth, and learnable constants. Our result decomposes the generalization gap into two interpretable components: a structure-selection term, reflecting the combinatorial complexity of choosing an expression-tree structure, and a constant-fitting term, capturing the complexity of optimizing numerical constants within a fixed structure. This decomposition provides a theoretical perspective on several widely used practices in GP, including parsimony pressure, depth limits, numerically stable operators, and interval arithmetic. In particular, our analysis shows how structural restrictions reduce hypothesis-class growth while stability mechanisms control the sensitivity of predictions to parameter perturbations. By linking these practical design choices to explicit complexity terms in the generalization bound, our work offers a principled explanation for commonly observed empirical behaviors in GP-based SR and contributes towards a more rigorous understanding of its generalization properties.
