Sahani Pathiraja, Peter Jan van Leeuwen
Model uncertainty quantification is an essential component of effective data assimilation. Model errors associated with sub-grid scale processes are often represented through stochastic parameterizations of the unresolved process. Many existing stochastic parameterization schemes are only applicable when knowledge of the true sub-grid scale process or full observations of the coarse scale process are available, which is typically not the case in real applications. We present a methodology for estimating the statistics of sub-grid scale processes for the more realistic case where only partial observations of the coarse scale process are available. Model error realizations are estimated over a training period by minimizing their conditional sum of squared deviations given some informative covariates (e.g. state of the system), constrained by available observations and assuming that the observation errors are smaller than the model errors. From these realizations a conditional probability distribution of additive model errors given these covariates is obtained, allowing for complex non-Gaussian error structures. Random draws from this density are then used in actual ensemble data assimilation experiments. We demonstrate the efficacy of the approach through numerical experiments with the multi-scale Lorenz 96 system using both small and large time scale separations between slow (coarse scale) and fast (fine scale) variables. The resulting error estimates and forecasts obtained with this new method are superior to those from two existing methods.
Francesca Romana Crucinio, Sahani Pathiraja
Wasserstein-Fisher-Rao (WFR) gradient flows have been recently proposed as a powerful sampling tool that combines the advantages of pure Wasserstein (W) and pure Fisher-Rao (FR) gradient flows. Existing algorithmic developments implicitly make use of operator splitting techniques to numerically approximate the WFR partial differential equation, whereby the W flow is evaluated over a given step size and then the FR flow (or vice versa). This work investigates the impact of the order in which the W and FR operators are evaluated and aims to provide a quantitative analysis. Somewhat surprisingly, we show that with a judicious choice of step size and operator ordering, the split scheme can converge to the target distribution faster than the exact WFR flow (in terms of model time). We obtain variational formulae describing the evolution over one time step of both splitting schemes and investigate in which settings the W-FR split should be preferred to the FR-W split. As a step towards this goal we show that the WFR gradient flow preserves log-concavity and obtain the first sharp decay bound for WFR flow.
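The splitting idea can be illustrated in a toy setting. The following is a minimal sketch, not the paper's algorithm: it assumes a one-dimensional standard normal target and a Gaussian initial distribution, for which both the W flow (an Ornstein-Uhlenbeck Fokker-Planck evolution) and the FR flow (a geometric interpolation between the current law and the target) of the Kullback-Leibler divergence have closed forms. The function names, the target and the parameter values are all illustrative assumptions.

```python
import math


def w_step(mean, var, h):
    """Exact W flow over time h for target N(0, 1) (assumed toy target).

    This is the Fokker-Planck evolution of dX = -X dt + sqrt(2) dW.
    """
    return mean * math.exp(-h), 1.0 + (var - 1.0) * math.exp(-2.0 * h)


def fr_step(mean, var, h):
    """Exact FR flow over time h for target N(0, 1).

    The solution is mu_h proportional to mu_0^{a} * pi^{1 - a} with a = exp(-h),
    which for Gaussians is a precision-weighted combination.
    """
    a = math.exp(-h)
    precision = a / var + (1.0 - a)  # target precision is 1
    return (a * mean / var) / precision, 1.0 / precision


def split_flow(mean, var, h, n_steps, order="W-FR"):
    """Alternate the two exact sub-flows, in the requested order, n_steps times."""
    first, second = (w_step, fr_step) if order == "W-FR" else (fr_step, w_step)
    for _ in range(n_steps):
        mean, var = first(mean, var, h)
        mean, var = second(mean, var, h)
    return mean, var


def kl_to_standard_normal(mean, var):
    """KL divergence from N(mean, var) to N(0, 1)."""
    return 0.5 * (var + mean**2 - 1.0 - math.log(var))


if __name__ == "__main__":
    for order in ("W-FR", "FR-W"):
        m, v = split_flow(3.0, 4.0, h=0.1, n_steps=5, order=order)
        print(order, m, v, kl_to_standard_normal(m, v))
```

Comparing the printed Kullback-Leibler values for the two orderings at different step sizes `h` gives a quick feel for the ordering effect in this Gaussian special case; it does not reproduce the general analysis described in the abstract.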
Sahani Pathiraja, Sebastian Reich, Wilhelm Stannat
Various particle filters have been proposed over the last couple of decades with the common feature that the update step is governed by a type of control law. This feature makes them an attractive alternative to traditional sequential Monte Carlo, which scales poorly with the state dimension due to weight degeneracy. This article proposes a unifying framework that allows one to systematically derive the McKean-Vlasov representations of these filters for the discrete time and continuous time observation case, taking inspiration from the smooth approximation of the data considered in Crisan & Xiong (2010) and Clark & Crisan (2005). We consider three filters that have been proposed in the literature and use this framework to derive Itô representations of their limiting forms as the approximation parameter $δ\rightarrow 0$. All filters require the solution of a Poisson equation defined on $\mathbb{R}^{d}$, for which existence and uniqueness of solutions can be a non-trivial issue. We additionally establish conditions on the signal-observation system that ensure well-posedness of the weighted Poisson equation arising in one of the filters.
Sahani Pathiraja, Philipp Wacker
Dec 11, 2024 · q-bio.PE. The replicator-mutator equation is a model for populations of individuals carrying different traits, with a fitness function mediating their ability to replicate, and a stochastic model for mutation. We derive analytical solutions for the replicator-mutator equation in continuous time and for continuous traits for a quadratic fitness function. Using these results we can explain and quantify (without the need for numerical in-silico simulations) a series of evolutionary phenomena, in particular the flying kite effect, survival of the flattest, and the ability of a population to sustain itself while tracking an optimal feature which may be fixed, moving with bounded velocity in trait space, oscillating, or randomly fluctuating.
Sahani Pathiraja, Philipp Wacker
It has long been posited that there is a connection between the dynamical equations describing evolutionary processes in biology and sequential Bayesian learning methods. This manuscript describes new research in which this precise connection is rigorously established in the continuous time setting. Here we focus on a partial differential equation known as the Kushner-Stratonovich equation describing the evolution of the posterior density in time. Of particular importance is a piecewise smooth approximation of the observation path from which the discrete time filtering equations are obtained; these are shown to converge to a Stratonovich interpretation of the Kushner-Stratonovich equation. This smooth formulation will then be used to draw precise connections between nonlinear stochastic filtering and replicator-mutator dynamics. Additionally, gradient flow formulations will be investigated as well as a form of replicator-mutator dynamics which is shown to be beneficial for the misspecified model filtering problem. It is hoped this work will spur further research into exchanges between sequential learning and evolutionary biology and inspire new algorithms in filtering and sampling.
Sahani Pathiraja, Wilhelm Stannat
Control-type particle filters have been receiving increasing attention over the last decade as a means of obtaining sample based approximations to the sequential Bayesian filtering problem in the nonlinear setting. Here we analyse one such type, namely the feedback particle filter and a recently proposed approximation of the associated gain function based on diffusion maps. The key purpose is to provide analytic insights on the form of the approximate gain, which are of interest in their own right. These are then used to establish a roadmap to obtaining well-posedness and convergence of the finite $N$ system to its mean field limit. A number of possible future research directions are also discussed.
Sahani Pathiraja
The aim of this paper is to obtain convergence in mean in the uniform topology of piecewise linear approximations of Stochastic Differential Equations (SDEs) with $C^1$ drift and $C^2$ diffusion coefficients with uniformly bounded derivatives. Convergence analyses for such Wong-Zakai approximations most often assume that the coefficients of the SDE are uniformly bounded. Almost sure convergence in the unbounded case can be obtained using now standard rough path techniques, although $L^q$ convergence appears yet to be established and is of importance for several applications involving Monte-Carlo approximations. We consider $L^2$ convergence in the unbounded case using a combination of traditional stochastic analysis and rough path techniques. We expect our proof technique to extend to more general piecewise smooth approximations.
Sahani Pathiraja, Claudia Schillings, Philipp Wacker
Sequential filtering and spatial inverse problems assimilate data points distributed either temporally (in the case of filtering) or spatially (in the case of spatial inverse problems). Sometimes it is possible to choose the position of these data points (which we call sensors here) in advance, with the goal of maximising the expected information gain (or a different metric of performance) from future data, and this leads to an Optimal Experimental Design (OED) problem. Here we revisit an interpretation of optimising sensor placement as an integration with respect to a general probability measure $ξ$. This generalises the problem of discrete-time sensor placement (which corresponds to the special case where the probability measure is a mixture of Diracs) to an infinite-dimensional, but mathematically more well-behaved setting. We focus on the continuous-time stochastic filtering setting, whose solution is governed by the Zakai equation. We derive an expression for the Fréchet derivative of a general OED utility functional, the key to which is an adjoint (backwards in time) differential equation. This paves the way for utilising new gradient-based methods for solving the corresponding optimisation problem, as a potentially more efficient alternative to (semi-)discrete optimisation methods, e.g. based on greedy insertion and deletion of sensor placements.
Francesca R. Crucinio, Sahani Pathiraja
We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise the Kullback--Leibler divergence from $π$. We consider several partial differential equations (PDEs) whose solution is a minimiser of the Kullback--Leibler divergence from $π$ and connect them to well-known Monte Carlo algorithms. We focus in particular on PDEs obtained by considering the Wasserstein--Fisher--Rao geometry over the space of probabilities and show that these lead to a natural implementation using importance sampling and sequential Monte Carlo. We propose a novel algorithm to approximate the Wasserstein--Fisher--Rao flow of the Kullback--Leibler divergence and conduct an extensive empirical study to identify when these algorithms outperform other popular Monte Carlo algorithms.
Francesca Romana Crucinio, Sahani Pathiraja
We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise the Kullback--Leibler divergence from $π$. We consider the effect of replacing $π$ with a sequence of moving targets $(π_t)_{t\ge0}$ defined via geometric tempering on the Wasserstein and Fisher--Rao gradient flows. We show that convergence occurs exponentially in continuous time, providing novel bounds in both cases. We also consider popular time discretisations and explore their convergence properties. We show that in the Fisher--Rao case, replacing the target distribution with a geometric mixture of initial and target distribution never leads to a convergence speed-up, in either continuous or discrete time. Finally, we explore the gradient flow structure of tempered dynamics and derive novel adaptive tempering schedules.
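For concreteness, a geometric tempering path between an initial distribution and the target is commonly written in the form below. The symbols $π_0$ (initial distribution) and $λ_t$ (tempering schedule) are notation assumed here for illustration and are not taken from the abstract.

$$
π_t(x) \propto π_0(x)^{1-λ_t}\, π(x)^{λ_t}, \qquad λ_t \in [0,1], \quad λ_t \text{ non-decreasing in } t .
$$

Under this convention, $λ_t = 0$ recovers $π_0$ and $λ_t = 1$ recovers the target $π$, so the moving targets $(π_t)_{t\ge0}$ interpolate between the two as the schedule increases.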
Sahani Pathiraja, Sebastian Reich
In this paper, we exploit the gradient flow structure of continuous-time formulations of Bayesian inference in terms of their numerical time-stepping. We focus on two particular examples, namely, the continuous-time ensemble Kalman-Bucy filter and a particle discretisation of the Fokker-Planck equation associated to Brownian dynamics. Both formulations can lead to stiff differential equations which require special numerical methods for their efficient numerical implementation. We compare discrete gradient methods to alternative semi-implicit and other iterative implementations of the underlying Bayesian inference problems.
Soumick Chatterjee, Franziska Gaidzik, Alessandro Sciarra, Hendrik Mattern, Gábor Janiga, Oliver Speck, Andreas Nürnberger, Sahani Pathiraja
In the domain of medical imaging, many supervised learning based methods for segmentation face several challenges such as high variability in annotations from multiple experts, paucity of labelled data and class imbalanced datasets. These issues may result in segmentations that lack the requisite precision for clinical analysis and can be misleadingly overconfident without associated uncertainty quantification. This work proposes the PULASki method as a computationally efficient generative tool for biomedical image segmentation that accurately captures variability in expert annotations, even in small datasets. This approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure (Probabilistic UNet), which improves learning of the conditional decoder compared to the standard cross-entropy, particularly in class imbalanced problems. The proposed method was analysed for two structurally different segmentation tasks (intracranial vessel and multiple sclerosis (MS) lesion) and compared to four well-established baselines in terms of quantitative metrics and qualitative output. These experiments involve class-imbalanced datasets characterised by challenging features, including suboptimal signal-to-noise ratios and high ambiguity. Empirical results demonstrate the PULASki method outperforms all baselines at the 5\% significance level. Our experiments are also among the first to present a comparative study of the computationally feasible segmentation of complex geometries using 3D patches and the traditional use of 2D slices. The generated segmentations are shown to be much more anatomically plausible than in the 2D case, particularly for the vessel task.
Jana de Wiljes, Sahani Pathiraja, Sebastian Reich
Several numerical tools designed to overcome the challenges of smoothing in a nonlinear and non-Gaussian setting are investigated for a class of particle smoothers. The considered family of smoothers is induced by the class of linear ensemble transform filters, which contains classical filters such as the stochastic ensemble Kalman filter, the ensemble square root filter and the recently introduced nonlinear ensemble transform filter. Further, the ensemble transform particle smoother is introduced and particularly highlighted as it is consistent in the particle limit and does not require assumptions with respect to the family of the posterior distribution. The linear update pattern of the considered class of linear ensemble transform smoothers allows one to implement important supplementary techniques such as adaptive spread corrections, hybrid formulations, and localization in order to facilitate their application to complex estimation problems. These additional features are derived and numerically investigated for a sequence of increasingly challenging test problems.
Adrian N. Bishop, Pierre Del Moral, Sahani D. Pathiraja
We analyse various perturbations and projections of Kalman-Bucy semigroups and Riccati equations. For example, covariance inflation-type perturbations and localisation methods (projections) are common in the ensemble Kalman filtering literature. In the limit of these ensemble methods, the regularised sample covariance tends toward a solution of a perturbed/projected Riccati equation. With this motivation, results are given characterising the error between the nominal and regularised Riccati flows and Kalman-Bucy filtering distributions. New projection-type models are also discussed; e.g. Bose-Mesner projections. These regularisation models are also of interest on their own, and in, e.g., differential games, control of stochastic/jump processes, and robust control.