Di Wu, Ling Liang, Haizhao Yang
Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. BOED traditionally selects designs by maximizing expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG often involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these fundamental bottlenecks, we introduce an IPM-based BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance, yielding a highly flexible plug-and-play formulation. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, by extending the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, we achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
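As one concrete way to make the sample-based, plug-and-play idea tangible, the sketch below scores a candidate design by an RBF-kernel MMD between simulated joint samples $(θ, y)$ and a permuted "product of marginals" sample. This is an illustrative instantiation under our own assumptions, not the paper's estimator: the `prior_sampler` and `simulate` callables, the kernel and its bandwidth, and the permutation construction are all placeholders.

```python
# Illustrative sketch (not the paper's exact estimator): an MMD-based design
# utility computed purely from simulator samples. A candidate design d is
# scored by the squared MMD between the joint sample {(theta_i, y_i)} and a
# "product of marginals" sample obtained by permuting the simulated outcomes.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Gram matrix of an RBF kernel between the rows of A and B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def mmd2(X, Y, lengthscale=1.0):
    # Biased (V-statistic) estimate of squared MMD between samples X and Y.
    Kxx = rbf_kernel(X, X, lengthscale)
    Kyy = rbf_kernel(Y, Y, lengthscale)
    Kxy = rbf_kernel(X, Y, lengthscale)
    return Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()

def mmd_utility(design, prior_sampler, simulate, n=512, rng=None):
    # Higher utility = stronger dependence between parameters and outcomes
    # under this design, measured geometrically rather than via log-density ratios.
    rng = np.random.default_rng(rng)
    theta = prior_sampler(n, rng)               # (n, d_theta)
    y = simulate(theta, design, rng)            # (n, d_y)
    joint = np.hstack([theta, y])
    perm = rng.permutation(n)
    product = np.hstack([theta, y[perm]])       # permutation breaks the dependence
    return mmd2(joint, product)
```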
František Bartoš, Eric-Jan Wagenmakers, Maarten Marsman, Don van den Bergh
Bayes factor sensitivity analysis examines how the evidence for one hypothesis over another depends on the prior distribution. In complex models, the standard approach refits the model at each hyper-parameter value, and the total computational cost scales linearly in the grid size. We propose a method that recovers the entire sensitivity curve from a single additional model fit. The key identity decomposes the Bayes factor at any hyper-parameter value $γ_x$ into an ``anchor'' Bayes factor at a fixed reference $γ_0$ and a Savage--Dickey density ratio in an extended model that places a hyper-prior on $γ$. Once this extended model is fit, the Bayes factor at any $γ_x$ follows from the anchor value and a ratio of two posterior density ordinates. To approximate this ratio, we employ the importance-weighted marginal density estimator (IWMDE). Because the sensitivity parameter enters the model only through the prior distribution on the model parameters, the data likelihood cancels in the IWMDE, reducing it to a simple ratio of prior density evaluations on the MCMC draws, without any additional likelihood computation. The resulting estimator is fast, remains accurate even with small MCMC samples, and substantially outperforms kernel density estimation across the full sensitivity range. The method extends naturally to simultaneous sensitivity over multiple hyper-parameters and to Bayesian model averaging. We illustrate it on a univariate Bayesian $t$-test with exact Bayes factors for validation, a bivariate informed $t$-test, and a Bayesian model-averaged meta-analysis, obtaining accurate sensitivity curves at a fraction of the brute-force cost.
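A minimal sketch of the kind of computation described above, under our own assumptions: the IWMDE weight function is taken to be the hyper-prior itself, the competing hypothesis is assumed not to depend on $γ$, and all function and variable names are placeholders rather than the paper's. With these choices, the whole sensitivity curve reduces to prior-density evaluations on the stored MCMC draws.

```python
# Sketch: given draws (theta_i, gamma_i) from the extended posterior that places
# a hyper-prior on gamma, the ratio of marginal posterior ordinates (divided by
# the corresponding hyper-prior ordinates) is estimated by an IWMDE in which the
# likelihood cancels, so no further likelihood evaluations are needed.
import numpy as np

def sensitivity_curve(anchor_bf, gamma_grid, gamma0,
                      theta_draws, gamma_draws, log_prior_theta):
    """anchor_bf:       Bayes factor computed once at the reference value gamma0
       theta_draws:     (N, p) MCMC draws of the model parameters
       gamma_draws:     (N,)  matching draws of the hyper-parameter
       log_prior_theta: log pi(theta | gamma), vectorised over draws"""
    base = log_prior_theta(theta_draws, gamma_draws)        # log pi(theta_i | gamma_i)

    def ordinate_ratio(gamma_x):
        # IWMDE with the hyper-prior as weight function: likelihood and
        # hyper-prior ordinate cancel, leaving prior ratios on the draws.
        return np.exp(log_prior_theta(theta_draws, gamma_x) - base).mean()

    r0 = ordinate_ratio(gamma0)
    return np.array([anchor_bf * ordinate_ratio(g) / r0 for g in gamma_grid])
```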
Utku Erdogan, Gabriel J. Lord, Joaquin Miguez
Particle filters (PFs) are recursive Monte Carlo algorithms for Bayesian tracking and prediction in state space models. This paper addresses continuous-discrete filtering problems, where the hidden state evolves as an Itô stochastic differential equation (SDE) and observations arrive at discrete times. We propose a novel class of constrained PFs that enforce compact support on the state at each observation instant, thereby limiting exploration to plausible regions of the state space. Unlike earlier approaches that truncate the likelihood, the proposed method constrains the dynamics directly, yielding improved numerical stability. Under standard regularity assumptions, we prove convergence of the constrained filter, derive uniform-in-time error estimates, and extend the analysis to account for discretisation errors arising from numerical SDE solvers. A numerical study on a stochastic Lorenz-96 system demonstrates the practical application of the methodology when the constraint is implemented via barrier functions.
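As an illustration of one way such a constraint can be implemented via barrier functions, the sketch below propagates particles between observation times with an Euler-Maruyama step whose drift is augmented by a logarithmic barrier that keeps the state inside a box. The drift, diffusion coefficient, and barrier strength are placeholders, and this is not the paper's exact construction.

```python
# Minimal sketch: constrained particle propagation for a continuous-discrete
# particle filter. The SDE drift is augmented by a repulsive logarithmic-barrier
# term so that particles remain in the box [lo, hi]^d between observation times.
import numpy as np

def barrier_drift(x, lo, hi, eps=1e-2):
    # Repulsive drift -grad B(x) for B(x) = -eps * sum(log(x - lo) + log(hi - x));
    # it grows near the box boundary and pushes particles back inside.
    return eps * (1.0 / (x - lo) - 1.0 / (hi - x))

def propagate(particles, drift, sigma, dt, n_steps, lo, hi, rng):
    # Euler-Maruyama propagation of all particles over one inter-observation interval.
    x = particles.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + (drift(x) + barrier_drift(x, lo, hi)) * dt \
              + sigma * np.sqrt(dt) * noise
        x = np.clip(x, lo + 1e-8, hi - 1e-8)   # numerical safeguard only
    return x
```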
Lena Zellinger, Antonio Vergari
When approximating an intractable density via variational inference (VI), the variational family is typically chosen as a simple parametric family that very likely does not contain the target. This raises the question: under which conditions can we recover characteristics of the target despite misspecification? In this work, we extend previous results on robust VI with location-scale families under target symmetries. We derive sufficient conditions guaranteeing exact recovery of the mean when using the forward Kullback-Leibler divergence and $α$-divergences. We further show how and why optimization can fail to recover the target mean in the absence of our sufficient conditions, providing initial guidelines on the choice of the variational family and $α$-value.
Leo L Duan
Variable selection in linear regression has been a central topic in statistical research for decades. Bayesian variable selection methods, which account for uncertainty in both the regression coefficients and the noise variance, have achieved broad success through the use of discrete or continuous shrinkage priors and efficient collapsed Gibbs samplers. Despite their popularity and strong empirical performance, an enigma remains: the marginal likelihood, obtained by integrating out the regression coefficients and noise variance, is not log-concave; therefore, there is no guarantee of reliably finding its global optimum. In this article, we study this problem from an optimization perspective. Taking the negative log-marginal likelihood as a loss function of the latent precision parameters, we can rewrite it as a difference of convex functions (DC), and then optimize it via a simple iterative algorithm. Under mild compact set conditions, the DC algorithm converges to the global optimum at a linear rate. The positive finding applies to type-II maximum likelihood and extends to maximum marginal posterior under suitable priors, indicating that the problem of mode finding in Bayesian variable selection is much more benign than the lack of log-concavity might suggest. Besides the theoretical insight, the proposed algorithm is easy to implement, free of tuning, and extensible to structured sparsity, and thus can serve as an efficient alternative or warm-start for traditional Markov chain Monte Carlo solutions. The method is illustrated through numerical studies and a spatial data application for quantifying the aftershock risk following the 2019 Ridgecrest earthquakes. The source code for the algorithm is publicly available at https://github.com/leoduan/dca_optimization_variable_selection.
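The optimisation template referred to above is the classical DC algorithm (DCA): repeatedly linearise the concave part and solve the resulting convex subproblem exactly. The sketch below applies it to a toy one-dimensional double-well, not to the paper's marginal-likelihood loss over latent precision parameters.

```python
# Generic difference-of-convex (DC) iteration on a toy objective. We minimise
# f(x) = g(x) - h(x) with g, h convex by linearising h at the current iterate
# and minimising the resulting convex surrogate.
import numpy as np
from scipy.optimize import minimize_scalar

def dca(g, grad_h, x0, n_iter=50):
    x = x0
    for _ in range(n_iter):
        slope = grad_h(x)                              # linearise the concave part
        surrogate = lambda z: g(z) - slope * z         # convex majoriser (up to a constant)
        x = minimize_scalar(surrogate).x               # exact convex subproblem
    return x

# Toy double-well f(x) = x**4/4 - x**2, written as g - h with
# g(x) = x**4/4 + x**2 and h(x) = 2*x**2 (both convex).
g = lambda x: x**4 / 4 + x**2
grad_h = lambda x: 4 * x
print(dca(g, grad_h, x0=0.5))   # converges to the nearby minimiser x = sqrt(2)
```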
Matteo Amestoy, Mark A. van de Wiel, Wessel N. van Wieringen
ProfileGLMM is an R package integrating Generalised Linear Mixed Models (GLMMs) as the outcome model for Bayesian profile regression. This statistical framework simultaneously i) explains the variation in the outcome and ii) clusters the observations based on a specified set of interdependent clustering covariates. The derived cluster memberships are then incorporated, alongside others, as explanatory variables in the regression to model the outcome. This framework efficiently handles complex, highly correlated covariate structures whose direct inclusion in a standard regression model would be statistically sub-optimal. ProfileGLMM significantly extends Bayesian profile regression's scope by resolving two key constraints of previous implementations: 1) it allows the analysis of hierarchical and longitudinal data structures through the inclusion of random effects, and 2) it enables the study of interactions between latent clusters and other observable covariates. ProfileGLMM accommodates various data types, supporting both continuous and binary outcomes and both categorical and continuous clustering covariates. Built on fast Rcpp code with minimal mandatory parameters, ProfileGLMM offers a flexible analytical tool. It significantly enhances the utility of profile regression for researchers in fields such as epidemiology, social sciences, and clinical studies dealing with complex data.
Surya Ratna Prakash D, Soumyendu Raha
Transient instability in nonlinear stochastic dynamical systems is a fundamental limitation in safety-critical aerospace applications, particularly during powered descent and landing where failure is driven by finite-time excursions rather than asymptotic divergence. Classical notions of mean-square or asymptotic stability are therefore insufficient for certification and design. This paper develops a logarithmic-norm-based framework for finite-time transient stability analysis of nonlinear Ito stochastic differential equations. The approach extends matrix measures to nonlinear mappings in a Lipschitz sense, enabling efficient characterization of instantaneous perturbation growth without local linearization. Using Ito calculus, bounds on the mean and variance of transient growth are derived, providing conditions for non-positive finite-time mean growth and probabilistic bounds on instability events. The analysis highlights a key distinction between mean and sample-path behavior, showing that stability in expectation does not guarantee pathwise finite-time safety, and that almost-sure transient stability cannot generally be ensured under stochastic diffusion. The framework is extended to data-constrained stochastic dynamics in navigation and estimation, revealing a trade-off between estimation consistency and transient robustness due to continuous data injection. Demonstrations with flight-like lunar lander telemetry show that similar mean trajectories can exhibit significantly different transient stability behaviour, and that mission failure correlates with accumulation of transient instability over short critical intervals. These results motivate probabilistic finite-time stability metrics for safety-critical autonomous systems.
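For background, the classical logarithmic norm (matrix measure) of a matrix $A$ induced by a vector norm $\lVert\cdot\rVert$, and the deterministic transient-growth bound it yields, are
\[
  \mu(A) \;=\; \lim_{h \to 0^{+}} \frac{\lVert I + hA\rVert - 1}{h},
  \qquad
  \dot{x} = A(t)\,x \;\;\Longrightarrow\;\;
  \lVert x(t)\rVert \;\le\; \lVert x(t_0)\rVert
  \exp\!\Big(\int_{t_0}^{t} \mu\big(A(s)\big)\,ds\Big);
\]
the contribution above extends this deterministic notion to nonlinear mappings and Itô SDEs, where the stochastic bounds themselves are not reproduced here.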
Francesca Romana Crucinio, Sahani Pathiraja
We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise the Kullback--Leibler divergence from $π$. We consider the effect of replacing $π$ with a sequence of moving targets $(π_t)_{t\ge0}$ defined via geometric tempering on the Wasserstein and Fisher--Rao gradient flows. We show that convergence occurs exponentially in continuous time, providing novel bounds in both cases. We also consider popular time discretisations and explore their convergence properties. We show that, in the Fisher--Rao case, replacing the target distribution with a geometric mixture of the initial and target distributions never leads to a convergence speed-up, either in continuous or in discrete time. Finally, we explore the gradient flow structure of tempered dynamics and derive novel adaptive tempering schedules.
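For reference, the geometric tempering path referred to above interpolates between an initial distribution $\pi_0$ and the target $\pi$ on the log scale (the schedule notation $\beta_t$ is ours):
\[
  \pi_t(x) \;\propto\; \pi_0(x)^{\,1-\beta_t}\,\pi(x)^{\,\beta_t},
  \qquad \beta_0 = 0, \quad \beta_t \to 1 .
\]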
Hanwen Huang
We propose Annealed Langevin Monte Carlo for Flow ODE Sampling (ALMC-ODE), a method for generating samples from unnormalized target distributions, with a particular emphasis on multimodal densities that are challenging for standard Markov chain Monte Carlo methods. ALMC-ODE is based on a probability-flow ordinary differential equation (ODE) derived from stochastic interpolants, which continuously transports a standard Gaussian reference distribution at $t = 0$ to the target distribution $ρ$ at $t = 1$. The key innovation lies in an annealed Langevin Markov chain that evolves through a sequence of intermediate distributions bridging the reference and the target. The resulting importance-weighted particles, reweighted via a Jarzynski-based scheme, yield a low-variance estimator of the velocity field governing the ODE. On the theoretical side, we establish a Jarzynski-type reweighting identity for general time-inhomogeneous transition kernels, characterize the optimal backward kernel that minimizes the variance of the importance weights, and prove an $\mathcal{O}(1/n)$ mean squared error bound for the resulting velocity-field estimator. Numerical experiments on challenging benchmarks, including Gaussian mixture models and a 64-dimensional Allen--Cahn field system, demonstrate that ALMC-ODE significantly outperforms both direct Monte Carlo ODE approaches and Hamiltonian Monte Carlo when applied to highly multimodal target distributions.
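To fix ideas, the sketch below shows the standard annealed-importance-sampling weight bookkeeping (Neal, 2001) along a geometric bridge, which is the Jarzynski-style reweighting ingredient referred to above; the paper's actual scheme feeds such weights into a velocity-field estimator for the probability-flow ODE, which is not reproduced here. The step size, number of temperatures, and the use of unadjusted Langevin moves are our own illustrative choices.

```python
# Minimal annealed-importance-sampling / Jarzynski-style reweighting loop.
# `log_target` and `grad_log_target` describe an unnormalised target density.
import numpy as np

def ais(log_target, grad_log_target, dim,
        n_particles=2048, n_temps=200, step=0.05, rng=None):
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((n_particles, dim))           # reference: N(0, I)
    log_w = np.zeros(n_particles)
    log_ref  = lambda z: -0.5 * (z ** 2).sum(axis=1)      # up to a constant
    grad_ref = lambda z: -z
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    log_pi  = lambda z, b: (1 - b) * log_ref(z)  + b * log_target(z)        # geometric bridge
    grad_pi = lambda z, b: (1 - b) * grad_ref(z) + b * grad_log_target(z)

    for b_prev, b in zip(betas[:-1], betas[1:]):
        log_w += log_pi(x, b) - log_pi(x, b_prev)         # Jarzynski-style weight increment
        # one unadjusted Langevin step targeting the current bridge density
        # (a Metropolis correction would make the kernel exactly invariant)
        x = x + step * grad_pi(x, b) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x, log_w                                        # importance-weighted particles
```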
Yan Zhang
This work makes two advances in the study of the (approximate) nonparametric maximum likelihood estimator (NPMLE) for exponential family mixture models. First, we develop a data-compression strategy that reduces the cost of repeated likelihood evaluations in NPMLE computation to logarithmic order in the sample size. Second, we show that, for a broad class of approximate NPMLEs, the resulting marginal density estimator attains an almost parametric rate of convergence.
Mehmet Sıddık Çadırcı, Yener Ünal
The multivariate generalised Gaussian distribution (MGGD) is commonly used to model high-dimensional vectors with non-Gaussian radial behaviour, ranging from sharp-peaked to heavy-tailed profiles. However, because many classical multivariate tests are based on covariance inversion or high-dimensional density estimation, formal goodness-of-fit assessment for MGGD models remains challenging in modern regimes where the dimension is comparable to or exceeds the sample size. We introduce an affine-invariant, fully non-parametric goodness-of-fit procedure based on the nearest neighbour (NN) graph topology and the adapted zero principle. Following robust standardisation, we construct an independent reference sample from the adapted standardised MGGD and measure, on the combined NN graph, the cross-edge count to assess how well the observed and reference point clouds exhibit the mixture behaviour anticipated by the model. Calibration performed using a refitted parametric bootstrap accounts for nuisance-parameter uncertainty, thus ensuring reliable size under a composite specification. In this paper, we establish asymptotic validity under high-dimensional scaling and demonstrate consistency with respect to fixed elliptical departures, providing a geometric interpretation based on radial concentration and shell separation. Our simulation studies across a broad spectrum of dimensions and tail shapes reveal accurate Type I error control and robust power relative to heavy- and light-tailed alternatives, thus improving upon energy-distance benchmarks and normality-oriented graphical tests in contexts where MGGD modelling is most applicable.
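A bare-bones version of the pooled nearest-neighbour cross-edge statistic with parametric-bootstrap calibration is sketched below. The MGGD fitting and sampling routines (`fit_mggd`, `sample_mggd`) are user-supplied placeholders, and the paper's robust standardisation and adapted-zero-principle details are not reproduced.

```python
# Sketch: fraction of nearest-neighbour edges in the pooled sample that connect
# an observed point to a reference point, calibrated by a refitted parametric
# bootstrap. Small cross-edge fractions indicate separation of the two clouds.
import numpy as np
from scipy.spatial import cKDTree

def cross_edge_count(X, Y, k=1):
    pooled = np.vstack([X, Y])
    labels = np.r_[np.zeros(len(X)), np.ones(len(Y))]
    tree = cKDTree(pooled)
    _, idx = tree.query(pooled, k=k + 1)           # idx[:, 0] is the point itself
    neigh = idx[:, 1:]
    return np.mean(labels[neigh] != labels[:, None])

def gof_pvalue(X, fit_mggd, sample_mggd, n_boot=200, rng=None):
    rng = np.random.default_rng(rng)
    theta_hat = fit_mggd(X)
    observed = cross_edge_count(X, sample_mggd(theta_hat, len(X), rng))
    boot = []
    for _ in range(n_boot):
        Xb = sample_mggd(theta_hat, len(X), rng)   # data generated under the fitted model
        theta_b = fit_mggd(Xb)                     # refit to propagate nuisance uncertainty
        boot.append(cross_edge_count(Xb, sample_mggd(theta_b, len(Xb), rng)))
    boot = np.asarray(boot)
    # One-sided: small observed cross-edge fractions are evidence against the model.
    return (1 + np.sum(boot <= observed)) / (1 + n_boot)
```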
Simon Pauli, Andreas Futschik
This paper proposes an extension to discrete Phase-Type distributions (DPH) by introducing random rewards. These allow for modeling a system in which a visit to a certain state does not emit a deterministic reward; instead, the rewards follow either a Bernoulli or a geometric distribution. Utilizing this increased flexibility, we further sketch a possible use case for these random rewards by introducing the Inertia-Escalation model (IEM), a process with latent severity levels characterized through two parameters: inertia ν and escalation η. We also discuss parameter inference for such models. To validate and explore random rewards and the IEM, we conduct extensive simulations and apply the model to two datasets: a historical warfare dataset and the Telco customer churn dataset.
Edgar Jaber, Emmanuel Remy, Vincent Chabridon, Morgane Garo-Sail, Mathilde Mougeot, Didier Lucor, Jerome Delplace, Maxime Lointier
We present a hybrid framework to support prognostics of the clogging degradation phenomenon in tube support plates for digital twins of steam generators in pressurized water reactors. The proposed approach combines a physics-based simulation code, heterogeneous and sparse observational data, and several uncertainty quantification techniques to obtain a robust estimate of the steam generator remaining useful life associated with the clogging rate. The proposed framework is compatible with a digital twin platform to assist maintenance planning of EDF steam generators.
Ben Seiyon Lee, Reetam Majumder, Jordan Richards, Emma S. Simpson, Likun Zhang
Understanding and mapping extreme heat is critical for risk management and public health planning, particularly in regions with complex terrain and heterogeneous climate. We present a case study of extreme heat in the Four Corners region of the United States, using high-resolution surface skin temperature data from the North American Land Data Assimilation System to characterize spatially heterogeneous and seasonally varying extremes across complex terrain, and to assess their implications for heat-related public health risks. Spatial extremes exhibit complex dependencies across geographic regions, which require sophisticated statistical models to capture. While recent advances in spatial extreme value modeling provide flexible representations of joint tail dependencies, statistical inference remains computationally demanding, especially for datasets with a large number of locations. To address this, we propose a random scale mixture process that facilitates Bayesian inference of spatial extremes, and develop scalable inference strategies that leverage advances in spatial modeling and amortized learning. We evaluate the proposed inference methods through large-scale simulation studies, representing the first such extensive study in spatial extremes, and a high-resolution surface skin temperature application in the Four Corners region. Surface skin temperature is particularly useful as a predictor for air temperature, for studying heatwaves and related environmental phenomena, and to calculate heat indices reflecting downstream health risks at any location. Our findings provide insights into efficient, data-driven approaches for modeling spatial extremes, and serve as guidelines for practitioners in the fields of climate science, environmental risk assessment, and beyond.
Joseph D Consiglio
Popular software packages report four generalizations of the ANOVA F test when conducting a multivariate analysis of variance (MANOVA). The reported operating characteristics of these four tests vary widely depending on which research article the reader chooses. Some studies report extremely high type I error rates for a particular test even under ideal assumptions of multivariate normality and homoskedasticity; other studies report rates near the nominal level despite violations of the model assumptions. This simulation study seeks to clarify this apparent contradiction by providing a systematic evaluation of the type I error rates of the four statistics used to test for a group effect in MANOVA.
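As a minimal illustration of such a null simulation, the sketch below estimates the empirical Type I error of Wilks' Lambda (one of the four statistics typically reported, alongside Pillai's trace, the Hotelling-Lawley trace, and Roy's largest root) under multivariate normality and equal covariances, using the Bartlett chi-square approximation. The simulation design here is ours, not the paper's.

```python
# Empirical Type I error of Wilks' Lambda under an exact multivariate-normal null.
import numpy as np
from scipy.stats import chi2

def wilks_lambda_pvalue(groups):
    # groups: list of (n_i, p) arrays; H0: equal mean vectors, common covariance.
    X = np.vstack(groups)
    n, p = X.shape
    g = len(groups)
    grand = X.mean(axis=0)
    E = sum((G - G.mean(axis=0)).T @ (G - G.mean(axis=0)) for G in groups)   # within SSCP
    T = (X - grand).T @ (X - grand)                                          # total SSCP
    lam = np.linalg.det(E) / np.linalg.det(T)
    stat = -(n - 1 - (p + g) / 2) * np.log(lam)          # Bartlett correction
    return chi2.sf(stat, df=p * (g - 1))

rng = np.random.default_rng(0)
p, g, n_per, alpha, n_sim = 4, 3, 20, 0.05, 2000
rejections = 0
for _ in range(n_sim):
    groups = [rng.standard_normal((n_per, p)) for _ in range(g)]   # MVN null, equal covariances
    rejections += wilks_lambda_pvalue(groups) < alpha
print("empirical Type I error:", rejections / n_sim)               # should be near 0.05
```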
Simone Catanzaro, Elvira Di Nardo
The first passage time (FPT) problem is considered for a stochastic logistic growth model with constant harvesting and multiplicative environmental noise. Explicit expressions for the moments and cumulants of both upcrossing and downcrossing FPTs in the presence of constant thresholds are obtained through a power-series expansion of the Laplace transform. Then a closed-form representation of the FPT density is recovered via an orthogonal Laguerre--Gamma expansion. This representation is used to numerically evaluate FPT densities, with the truncation order controlling the trade-off between accuracy and stability. Numerical experiments based on Monte Carlo simulations confirm the high accuracy of the method in regimes of moderate dispersion and highlight its limitations when higher-order moments grow rapidly. Application to fisheries management models shows that the method remains effective even for large-scale populations. Finally, the approximate density is used satisfactorily to estimate some parameters of the model.
Denis Rustand, Håvard Rue, Lisa Le Gall, Karen Leffondre
Joint models for longitudinal and time-to-event data are increasingly used in health research to characterize the association between biomarker trajectories and the risk of clinical events. However, these models usually assume a linear relationship between the longitudinal marker and the log-hazard of the event. This assumption is rarely verified and often fails to capture complex biological mechanisms, such as U-shaped risk profiles or plateau effects. In this paper, we propose a fast and stable hierarchical framework for non-linear association structures in joint models using Integrated Nested Laplace Approximations (INLA), implemented in the INLAjoint R package. Our approach builds upon a unified framework where the scaling effect of the marker is decomposed into a parametric baseline (constant and linear components) and a data-driven smooth deviation modeled via an orthogonal basis derived from a second-order random walk. This natural hierarchy allows researchers to adapt model flexibility directly and verify the linearity assumption using standard information criteria. Through simulation studies, we demonstrate that the proposed method accurately recovers complex non-linear trajectories. We illustrate the practical utility of our framework by analyzing the joint association of the current value and current slope of body mass index (BMI) with all-cause mortality in the Health and Retirement Study. This analysis reveals a U-shaped mortality risk for the BMI value, and a non-linear effect for the rate of weight change, where a declining weight trajectory is associated with higher mortality risk.
Kijung Jeon, Michael Muehlebach, Molei Tao
Generative modeling within constrained sets is essential for scientific and engineering applications involving physical, geometric, or safety requirements (e.g., molecular generation, robotics). We present a unified framework for constrained diffusion models on generic nonconvex feasible sets $Σ$ that simultaneously enforces equality and inequality constraints throughout the diffusion process. Our framework incorporates both overdamped and underdamped dynamics for forward and backward sampling. A key algorithmic innovation is a computationally efficient landing mechanism that replaces costly and often ill-defined projections onto $Σ$, ensuring feasibility without iterative Newton solves or projection failures. By leveraging underdamped dynamics, we accelerate mixing toward the prior distribution, effectively alleviating the high simulation costs typically associated with constrained diffusion. Empirically, this approach reduces function evaluations and memory usage during both training and inference while preserving sample quality. On benchmarks featuring equality and mixed constraints, our method achieves comparable sample quality to state-of-the-art baselines while significantly reducing computational cost, providing a practical and scalable solution for diffusion on nonconvex feasible sets.
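A schematic version of a landing-style update for an equality constraint $c(x)=0$ is sketched below: instead of projecting onto the feasible set, each step adds a term that descends the infeasibility $\tfrac12\lVert c(x)\rVert^2$, pulling the iterate back toward the set. This is a generic illustration of the idea, not the paper's sampler; the step sizes, the placeholder drift, and the toy sphere constraint are our own choices.

```python
# Schematic landing-style update: no projection, only a correction term that
# decreases the constraint violation 0.5 * ||c(x)||^2 at every step.
import numpy as np

def landing_step(x, score, c, jac_c, step=1e-2, landing=1.0):
    """x:      current iterate
       score:  drift of the dynamics at x (e.g. a learned score), placeholder here
       c:      constraint map, c(x) = 0 on the feasible set
       jac_c:  Jacobian of c at x, shape (m, d)"""
    J = jac_c(x)
    infeasibility_grad = J.T @ c(x)            # gradient of 0.5 * ||c(x)||^2
    return x + step * score(x) - landing * step * infeasibility_grad

# Toy example: a rotational drift tangent to circles, constrained to the unit sphere.
c = lambda x: np.array([x @ x - 1.0])
jac_c = lambda x: 2.0 * x[None, :]
score = lambda x: np.array([-x[1], x[0]])      # placeholder drift, tangent to circles
x = np.array([1.5, 0.0])
for _ in range(500):
    x = landing_step(x, score, c, jac_c)
print(np.linalg.norm(x))                       # close to 1: iterates land on the sphere
```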
Zinovy Malkin
Possibilities are considered for simplifying the computation of several statistical functions used to test statistical hypotheses when processing observations: the inverse normal distribution, the Student's t-distribution, and the criterion for rejecting outliers. For these three cases, simple approximate expressions are proposed for the quantiles of these statistical distributions, which are accurate enough for most practical applications.
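The paper's own expressions are not reproduced here; as a flavour of what such closed-form quantile approximations look like, the snippet below implements the classical Abramowitz & Stegun rational approximation (eq. 26.2.23) to the standard normal quantile, whose absolute error is below about $4.5\times10^{-4}$.

```python
# Classical Abramowitz & Stegun (26.2.23) approximation to the standard normal
# quantile, shown purely for illustration of a simple closed-form quantile formula.
import math

def norm_quantile_as(p):
    """Approximate Phi^{-1}(p) for 0 < p < 1."""
    if p < 0.5:
        return -norm_quantile_as(1.0 - p)      # use symmetry for the lower tail
    t = math.sqrt(-2.0 * math.log(1.0 - p))
    num = 2.515517 + 0.802853 * t + 0.010328 * t * t
    den = 1.0 + 1.432788 * t + 0.189269 * t * t + 0.001308 * t ** 3
    return t - num / den

print(norm_quantile_as(0.975))   # approximately 1.96
```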
Lena Zellinger, Nicola Branchini, Lennert De Smet, Víctor Elvira, Nikolay Malkin, Antonio Vergari
Classical mixture models (MMs) are widely used tractable proposals for approximate inference settings such as variational inference (VI) and importance sampling (IS). Recently, mixture models with negative coefficients, called subtractive mixture models (SMMs), have been proposed as a potentially more expressive alternative. However, how to effectively use SMMs for VI and IS remains an open question, as they do not provide latent-variable semantics and therefore cannot reuse the sampling schemes available for classical MMs. In this work, we study how to circumvent this issue by designing several expectation estimators for IS and learning schemes for VI with SMMs, and we empirically evaluate them for distribution approximation. Finally, we discuss the additional challenges in estimation stability and learning efficiency that they carry and propose ways to overcome them. Code is available at: https://github.com/april-tools/delta-vi.