Changwang Xiao, Nan Yang, Qingxin Meng
This paper investigates the $H_{2}/H_{\infty}$ control problem for linear stochastic differential systems under partial observation. Unlike existing studies that assume full state accessibility, we consider the scenario where the controller has access only to an observation process. The objective is to design a controller that balances the $H_2$ performance criterion with the $H_\infty$ robustness requirement under worst-case disturbances, formulated as a nonzero-sum differential game. Using the Kalman filtering method, we derive the corresponding optimal filtering equation. Furthermore, a Stochastic Bounded Real Lemma under the partial observation framework is established, providing necessary and sufficient conditions for the $H_\infty$ robustness constraint. We also show the connection between the existence of a Nash equilibrium and the solvability of the cross-coupled Riccati equations, and illustrate the effectiveness of the proposed approach through a numerical example involving an unmanned aerial vehicle (UAV).
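The filtering step in the paper is continuous-time; as a loose discrete-time illustration of Kalman-type estimation under partial observation, a minimal scalar filter (all names and parameter values hypothetical, not the paper's model) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def kalman_filter(ys, a, c, q, r, x0=0.0, p0=1.0):
    """Minimal scalar Kalman filter for x_{t+1} = a x_t + w_t, y_t = c x_t + v_t
    (a discrete-time stand-in for the continuous-time filtering equation)."""
    x_hat, p = x0, p0
    estimates = []
    for y in ys:
        # predict
        x_hat, p = a * x_hat, a * a * p + q
        # update with the new observation
        k = p * c / (c * c * p + r)        # Kalman gain
        x_hat = x_hat + k * (y - c * x_hat)
        p = (1 - k * c) * p
        estimates.append(x_hat)
    return np.array(estimates)

# Simulate a partially observed scalar system and filter it.
a, c, q, r = 0.9, 1.0, 0.04, 0.25
xs, x = [], 0.0
for _ in range(500):
    x = a * x + rng.normal(scale=np.sqrt(q))
    xs.append(x)
ys = np.array(xs) + rng.normal(scale=np.sqrt(r), size=500)
est = kalman_filter(ys, a, c, q, r)
```

The filtered estimate should track the hidden state more closely than the raw observations do.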
Ilir Gusija, Fady Alajaji, Serdar Yüksel
Simultaneous localization and mapping (SLAM) is a foundational state estimation problem in robotics in which a robot accurately constructs a map of its environment while also localizing itself within this construction. We study the active SLAM problem through the lens of optimal stochastic control, thereby recasting it as a decision-making problem under partial information. After reviewing several commonly studied models, we present a general stochastic control formulation of active SLAM together with a rigorous treatment of motion, sensing, and map representation. We introduce a new exploration stage cost that encodes the geometry of the state when evaluating information-gathering actions. This formulation, constructed as a nonstandard partially observable Markov decision process (POMDP), is then analyzed to derive rigorously justified approximate solutions that are near-optimal. To enable this analysis, the associated regularity conditions are studied under general assumptions that apply to a wide range of robotics applications. For a particular case, we conduct an extensive numerical study in which standard learning algorithms are used to learn near-optimal policies.
Juan J. Forero-Hernández, Francisco Guillén-González, Élder J. Villamizar-Roa
We propose and analyze an optimal control problem associated with a Keller-Segel type parabolic system with chemoattraction, modeling glioblastoma growth in a two-dimensional bounded domain, influenced by the presence of oxygen, where the controls are two different therapies (chemotherapy and antiangiogenic). The model considers the random diffusion of tumor cells and oxygen, the movement of cells towards the oxygen gradient (oxytaxis), and reaction terms describing the interaction between cells and oxygen. We establish a mathematical framework to analyze the existence and uniqueness of a weak-strong solution of the model, and subsequently we analyze an optimal control problem with a cost functional that minimizes both the tumor growth and the oxygen concentration. We prove the existence of a global optimal solution and derive necessary first-order optimality conditions. Finally, we propose a methodology for approximating the optimal therapies: we compute the gradient of the reduced cost functional through the adjoint scheme and minimize the cost functional by the Adam gradient optimization method. Some numerical experiments are provided to demonstrate the effectiveness of the proposed scheme.
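The final minimization step uses the Adam optimizer; a minimal self-contained sketch of the Adam update (applied here to a toy quadratic standing in for the reduced cost functional, not the paper's PDE-constrained problem) is:

```python
import numpy as np

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_iter=500):
    """Minimal Adam loop: `grad` plays the role of the reduced-cost gradient
    that the adjoint scheme would supply in the paper's setting."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)   # first-moment (momentum) estimate
    v = np.zeros_like(x)   # second-moment estimate
    for t in range(1, n_iter + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)           # bias correction
        v_hat = v / (1 - beta2**t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy stand-in for the reduced cost: J(u) = ||u - u_target||^2.
u_target = np.array([1.0, -2.0])
u_opt = adam_minimize(lambda u: 2 * (u - u_target), np.zeros(2))
```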
François Delarue, Pierre Lavigne
We introduce a class of robust control problems formulated in min-max form, in which the principal agent is viewed as a central planner facing Nature. The agent's cost is a nonlinear function of all its possible realizations, encompassing in particular the mean field regime where the cost depends on the distribution of the states. In parallel, Nature favors the occurrence of outcomes that are least favorable to the agent, at an entropic cost. We establish existence and uniqueness of solutions under appropriate assumptions, including suitable convexity-concavity conditions, and derive a related stochastic maximum principle. We further address a corresponding class of robust variational mean field games in which the interaction term is subject to ambiguity, and prove existence and uniqueness of solutions.
Huan Li, Zhouchen Lin
In this short note, we establish, for the first time, the convergence rate of SOAP, an efficient and popular matrix-based optimizer for training deep neural networks. Our analysis extends to a more general variant of SOAP that admits arbitrary orthogonal projection matrices and requires only that these matrices be conditionally independent of the current stochastic gradient at each iteration. For example, they may be constructed from information available up to the preceding step.
Marcel Nutz, Chenyang Zhong
We study the vanishing-regularization limit of entropically regularized optimal transport (EOT) for the Euclidean distance cost $c(x,y)=\|x-y\|$ in dimension $d>1$. We develop a comprehensive variational convergence framework that entails two main results. First, we resolve the longstanding entropic selection problem: the EOT minimizer converges to a distinguished optimal transport plan that is characterized explicitly as the solution of a constrained EOT problem on each transport ray. Denoting by $\varepsilon>0$ the regularization parameter, this selection holds for all $o(\varepsilon)$-approximate minimizers, with sharp failure at the $O(\varepsilon)$ scale. Second, we establish an explicit second-order expansion of the entropic transport cost. The second-order term encodes the geometry of the regularization and reveals the optimal asymptotic tradeoff between entropy and transport cost.
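The entropically regularized problem itself is typically solved by the standard Sinkhorn iterations; as an illustration only (a tiny discrete instance with the Euclidean distance cost, not the paper's vanishing-regularization analysis), one might write:

```python
import numpy as np

def sinkhorn(mu, nu, C, eps, n_iter=2000):
    """Standard Sinkhorn iterations for the entropically regularized OT plan
    pi = diag(u) K diag(v) with Gibbs kernel K = exp(-C / eps)."""
    K = np.exp(-C / eps)
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)   # match column marginals
        u = mu / (K @ v)     # match row marginals
    return u[:, None] * K * v[None, :]

# Tiny example: two source points, two targets, cost c(x, y) = ||x - y||.
x = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([[0.0, 1.0], [1.0, 1.0]])
C = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
mu = nu = np.array([0.5, 0.5])
plan = sinkhorn(mu, nu, C, eps=0.05)
```

For small `eps` the plan concentrates near the cheapest pairing, here the vertical matching of each source to the target directly above it.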
Eduardo Casas, Karl Kunisch
This paper is dedicated to the analysis of infinite horizon optimal control problems subject to semilinear parabolic equations with constraints on the controls and discounted cost functionals. The discount factors on the cost and the state components are allowed to differ from each other. First-order as well as second-order optimality conditions are derived, and the importance of allowing different discount factors for the second-order analysis for the class of nonlinearities under consideration is demonstrated. Finally, convergence and the rate of convergence for the approximation of the infinite horizon problem by a family of finite horizon problems are proven.
Tugal Zhanlav, Lkhamsuren Altangerel, Khuder Otgondorj
In this paper, we propose a class of super-schemes for efficiently solving nonlinear unconstrained optimization problems. The proposed approach introduces two novel choices of step-size parameters, leading to efficient descent directions without requiring second-order information. We develop one-step, two-step, and three-step iterative schemes (denoted by SS1, SS2, and SS3) and establish that these methods achieve higher-order convergence of orders two, four, and six, respectively. Despite their high convergence rates, the computational complexity of the proposed methods remains comparable to existing gradient-based methods, with a cost of $\mathcal{O}(n^2)$ per iteration. The proposed methods are simple to implement and do not require complicated line-search procedures. Their effectiveness is demonstrated through extensive numerical experiments on a wide range of problems, including large-scale and ill-conditioned cases. The results show that the proposed methods significantly outperform classical methods, such as the Barzilai-Borwein method and other gradient-based approaches, in terms of iteration count and computational efficiency. Finally, the numerical results are consistent with the theoretical analysis, confirming the stability of the proposed schemes for test optimization problems.
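For context, the Barzilai-Borwein baseline mentioned above chooses the step size $\alpha_k = s_k^\top s_k / s_k^\top y_k$ from successive iterate and gradient differences; a minimal sketch of this comparison method (not the proposed super-schemes, whose step-size choices are given in the paper) is:

```python
import numpy as np

def barzilai_borwein(grad, x0, alpha0=1e-3, n_iter=100, tol=1e-8):
    """Classical BB1 step size alpha_k = (s^T s)/(s^T y), used here as the
    gradient-based baseline that the proposed super-schemes are compared with."""
    x = np.array(x0, dtype=float)
    g = grad(x)
    alpha = alpha0
    for _ in range(n_iter):
        x_new = x - alpha * g
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y)   # BB1 step size
        x, g = x_new, g_new
    return x

# Ill-conditioned quadratic f(x) = 0.5 x^T A x with condition number 100.
A = np.diag([1.0, 100.0])
x_min = barzilai_borwein(lambda x: A @ x, np.array([1.0, 1.0]))
```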
Fernando Mário de Oliveira Filho, Andreas Spomer, Frank Vallentin
We determine putative optimal packings of regular spherical polygons via optimization on smooth manifolds. For several cases, we establish maximality by extending the Lovász theta number to Cayley graphs on the special orthogonal group ${\rm SO}(3)$. To this end, we introduce an algebraic criterion characterizing when congruent regular spherical polygons have disjoint interiors, leading to a unified formulation of the packing constraints. Using harmonic analysis on ${\rm SO}(3)$, we reduce the theta number to a trigonometric sum-of-squares problem, which can be solved via semidefinite programming.
Hideaki Iiduka
The Halpern algorithm is a powerful fixed point approximation method for finding the closest point in the fixed point set of a nonexpansive mapping to the initial point. In practice, however, the algorithm cannot necessarily be applied to large-scale fixed point problems, since computing the nonexpansive mapping is expensive. In this paper, we present a mini-batch stochastic Halpern algorithm to resolve the issue caused by the computational difficulty of the mapping. We perform a convergence analysis demonstrating that the algorithm, with diminishing step sizes and increasing batch sizes, converges in mean square to the closest point in the fixed point set to the initial point. We also perform a convergence rate analysis demonstrating that the convergence speed of the algorithm depends on the settings of the diminishing step sizes.
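A toy sketch of the mini-batch idea, under the hypothetical assumption that the nonexpansive mapping (here, projection onto the unit ball) is only available through noisy evaluations whose growing mini-batch averages concentrate:

```python
import numpy as np

rng = np.random.default_rng(0)

def proj_unit_ball(x):
    """Projection onto the unit ball: a nonexpansive mapping whose fixed
    point set is the ball itself."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

x0 = np.array([3.0, 0.0])   # initial point
x = x0.copy()
sigma = 0.1                 # per-sample evaluation noise level
for k in range(2000):
    batch = (k + 1) ** 2                        # increasing batch size
    # Simulate the mean of `batch` noisy evaluations of T(x) directly.
    noise = rng.normal(scale=sigma / np.sqrt(batch), size=2)
    Tx = proj_unit_ball(x) + noise
    alpha = 1.0 / (k + 2)                       # diminishing step size
    x = alpha * x0 + (1.0 - alpha) * Tx         # Halpern update

# x should approach the point of the fixed point set closest to x0, i.e. (1, 0).
```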
Jinyang Shi, Luo Luo
This paper considers the constrained stochastic nonsmooth minimax optimization problem of the form $\min_{\mathbf{x}\in\mathcal{X}}\max_{\mathbf{y}\in\mathcal{Y}}f\left(\mathbf{x},\mathbf{y}\right)=\mathbb{E}[F(\mathbf{x},\mathbf{y};\boldsymbol{\xi})]$, where the objective $f(\mathbf{x},\mathbf{y})$ is concave in $\mathbf{y}$ but possibly nonconvex in $\mathbf{x}$, the stochastic component $F(\mathbf{x},\mathbf{y};\boldsymbol{\xi})$ indexed by the random variable $\boldsymbol{\xi}$ is mean-squared Lipschitz continuous, and the feasible sets $\mathcal X$ and $\mathcal Y$ are convex and compact. We introduce the notion of an $(\eta_x,\eta_y,\delta,\varepsilon)$-Goldstein saddle stationary point (GSSP) to characterize convergence for solving constrained nonsmooth minimax problems. We then develop projected gradient-free descent ascent methods for finding $(\eta_x,\eta_y,\delta,\varepsilon)$-GSSPs of the objective function $f(\mathbf{x},\mathbf{y})$ with non-asymptotic convergence rates. We further propose nested-loop projected gradient-free descent ascent methods to establish non-asymptotic convergence for finding $(\eta,\delta,\varepsilon)$-generalized Goldstein stationary points (GGSPs) [Liu et al., 2024] of the primal function $\Phi(\mathbf{x})\triangleq\max_{\mathbf{y}\in\mathcal{Y}}{f}\left(\mathbf{x},\mathbf{y}\right)$. It is worth noting that our algorithm designs and theoretical analyses do not require additional assumptions such as the weak convexity used in prior works on nonsmooth minimax optimization [Lin et al., 2025, Boţ and Böhm, 2023].
Yanxu Su, Xiaorui Tong, Changyin Sun
Zeroth-order (ZO) optimization is indispensable for complex non-convex tasks where explicit gradients are computationally prohibitive or strictly inaccessible. For deploying ZO methods over distributed heterogeneous networks, the gradient tracking technique is often employed to eliminate structural data biases. However, the inherent variance of derivative-free estimators is also amplified. To overcome this problem, we propose Zeroth-Order Momentum Gradient Tracking (ZO-MGT), which integrates momentum-based variance reduction with dynamic gradient tracking. Specifically, ZO-MGT requires exactly two function queries per iteration, avoiding costly batch sampling and preventing variance explosion while eliminating structural biases. Moreover, by utilizing Rademacher perturbations, it preserves optimal query efficiency and enables bitwise hardware acceleration. We theoretically analyze the convergence of ZO-MGT and establish an $\mathcal{O}(1/T)$ convergence rate. Furthermore, we prove that a large momentum factor can aggressively suppress the heterogeneity-induced bias floor at a remarkable quadratic rate of $\mathcal{O}((1-\beta)^2)$. Numerical experiments under extreme data heterogeneity verify that ZO-MGT can effectively overcome traditional tracking failures with accelerated convergence guarantees, while achieving significantly tighter consensus.
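The two-function-query estimator with Rademacher perturbations referred to above can be sketched as follows (the standard two-point estimator only, not the full ZO-MGT algorithm with momentum and tracking):

```python
import numpy as np

rng = np.random.default_rng(0)

def two_point_rademacher_grad(f, x, mu=1e-4):
    """Two-function-query zeroth-order gradient estimate with a Rademacher
    perturbation u in {-1, +1}^n:
        g = (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u
    """
    u = rng.choice([-1.0, 1.0], size=x.shape)
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# Sanity check on f(x) = ||x||^2, whose true gradient is 2x: for quadratics
# the estimator is unbiased, so its sample mean approaches 2x.
f = lambda x: x @ x
x = np.array([1.0, 2.0, 3.0])
est = np.mean([two_point_rademacher_grad(f, x) for _ in range(20000)], axis=0)
```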
Sridhar Babu Mudhangulla, Olugbenga Moses Anubi
This paper presents a unified string-stability framework for leader-follower multi-agent systems governed by first-, second-, and m-th order consensus protocols operating under an r-predecessor directed communication topology. While string stability has been extensively studied for specific vehicle models and individual consensus protocols, existing results remain fragmented across protocol orders and do not identify the fundamental factors governing disturbance amplification or attenuation. This work shows that, for all consensus orders, string stability is dictated solely by the communication richness r, while the protocol order m influences only the mid-frequency transient behavior. In particular, the low-frequency gain of the disturbance propagation coefficient is inversely proportional to r for every m, implying that higher-order consensus cannot overcome the structural limitation imposed by insufficient communication and that, under the adopted H-infinity-based string-stability definition and the present framework, string stability is achievable if and only if r >= 2. This establishes a structural-dynamic separation principle that unifies and generalizes classical platoon results, providing new insight into the interplay between topology and controller design in cooperative driving and multi-agent coordination. The framework is developed under idealized identical-agent and fixed-topology assumptions, providing a baseline for future robust extensions. Numerical simulations corroborate the analysis and illustrate how m and r jointly shape disturbance propagation along the formation.
Yichen Zhou, Stephen Tu
There has been remarkable progress over the past decade in establishing finite-sample, non-asymptotic bounds on recovering unknown system parameters from observed system behavior. Surprisingly, however, we show that the current state-of-the-art bounds do not accurately capture the statistical complexity of system identification, even in the most fundamental setting of estimating a discrete-time linear dynamical system (LDS) via ordinary least-squares regression (OLS). Specifically, we utilize asymptotic normality to identify classes of problem instances for which current bounds overstate the squared parameter error, in both spectral and Frobenius norm, by a factor of the state-dimension of the system. Informed by this discrepancy, we then sharpen the OLS parameter error bounds via a novel second-order decomposition of the parameter error, where crucially the lower-order term is a matrix-valued martingale that we show correctly captures the CLT scaling. From our analysis we obtain finite-sample bounds for both (i) stable systems and (ii) the many-trajectories setting that match the instance-specific optimal rates up to constant factors in Frobenius norm, and polylogarithmic state-dimension factors in spectral norm.
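The OLS estimator analyzed above can be illustrated on a toy stable system (a minimal single-trajectory instance, not the paper's sharpened error analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one trajectory of x_{t+1} = A x_t + w_t with Gaussian noise.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
T = 5000
X = np.zeros((T + 1, 2))
for t in range(T):
    X[t + 1] = A_true @ X[t] + rng.normal(scale=0.1, size=2)

# OLS: A_hat = argmin_A sum_t ||x_{t+1} - A x_t||^2.
# lstsq solves X[:-1] @ B ~= X[1:], i.e. B = A^T.
B, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
A_hat = B.T
```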
Ge Chen, Yiwei Qiu, Shiyao Zhang, Pengfei Su, Haoran Deng, Hongcai Zhang
This paper proposes a scalable coordination framework with aggregator-side privacy protection for storage-like distributed energy resources (DERs). The framework adopts a two-layer architecture. At the macroscopic layer, building upon an \emph{Eulerian} modeling perspective, the DER population is represented as a continuum whose density evolution is governed by a partial differential equation (PDE), such that the computational complexity is independent of the population size. To address the bilinear non-convexity in this PDE-constrained optimization problem, we develop a convexification method that combines finite-volume discretization with a flux-lifting technique, reformulating the macroscopic problem into a sparse linear program (LP). The LP solution yields a unified, state-dependent broadcast signal for population coordination. Furthermore, a Wasserstein-based relaxation is introduced to replace rigid cyclic constraints and provide additional operational flexibility for improved economic performance. At the microscopic layer, individual resources autonomously recover local setpoints from the broadcast signal and their local states, while an upstream data-mixing protocol aggregates individual states into a macroscopic density histogram without exposing raw individual states to the aggregator. Numerical studies validate the scalability, feasibility, and economic effectiveness of the proposed framework.
Manou Rosenberg, Mengbin Ye, Brian D. O. Anderson
The Euclidean Steiner tree problem, normally posed in two dimensions, seeks to connect a set of prescribed terminal nodes by placing additional nodes, known as Steiner points, with edges connecting such nodes either to another Steiner point or a terminal node, and with the placements minimising the sum of all the edge lengths of the associated tree. We consider a problem in which we start with a known solution to a Steiner tree problem, and the terminal positions are then perturbed. A first-order approximation theorem is established for efficiently updating the Steiner point positions to recover a Steiner tree solution after the perturbations to terminal nodes. Numerical examples illustrate the effectiveness of our approach (including a stepwise application for large perturbations) as well as its limitations.
Huyên Pham, Yuming Paul Zhang, Yuhua Zhu
This paper establishes a rigorous connection between regularized discrete-time reinforcement learning (RL) and continuous-time stochastic optimal control. Specifically, classical RL algorithms typically solve a regularized discrete-time Bellman equation. We study the discretization error, namely, the gap between the optimal policy induced by the regularized discrete-time Bellman equation and the true optimal feedback control of the underlying continuous-time stochastic control problem. By deriving quantitative convergence rates for this gap, we provide a rigorous foundation for understanding the stability and implementation of exploratory RL policies in stochastic continuous-time environments.
Toshinori Kitamura, Arnob Ghosh, Alex Ayoub, Thang D. Chu, Csaba Szepesvári
Projected subgradient descent (PSD) has gained popularity for solving robust Markov decision processes (RMDPs) because it applies to a broader class of uncertainty sets than traditional dynamic programming. Existing work claims that RMDPs with a general compact uncertainty set satisfy the subgradient dominance property, under which exact PSD converges to an $\varepsilon$-optimal policy in a polynomial number of updates (e.g., Wang et al., 2023). We show that these claims are incorrect. Even when the uncertainty set has cardinality two, the RMDP objective is not subgradient-dominant and can admit suboptimal strict local minima. Moreover, we prove that finding an $\varepsilon$-optimal policy can be NP-hard even in settings where subgradients are efficiently computable: (i) finite transition uncertainty sets and (ii) $sa$-rectangular finite transition uncertainty sets with finite cost uncertainty sets. Finally, we identify two conditions under which RMDPs do satisfy subgradient dominance: when, for each policy, either the worst-case transition kernel or the worst-case action-value function is unique.
Yunier Bello-Cruz
The Method of Ellipcenters (ME), introduced in~\cite{ME2025} for strongly convex quadratic minimization, uses two gradient evaluations per iteration: one at the current iterate and one at a companion point on the same level set. We extend ME to the broader class of strongly convex functions with Lipschitz continuous gradient, and prove that ME matches the convergence rate of gradient descent with exact line search on this class. When the two gradient directions are linearly independent, a midpoint argument exploiting the level-set symmetry yields a further per-step improvement, which is global when the angle between the two gradients is uniformly bounded away from zero. ME also converges in at most two steps in dimension two. Numerical experiments on regularized logistic regression confirm the theoretical predictions.
Giulia Gatti, Giacomo Como
We study the Susceptible-Infected-Recovered-Susceptible (SIRS) epidemic model on deterministic networks. For connected but otherwise general interaction patterns and heterogeneous recovery and loss-of-immunity rates, we identify a fundamental parameter R_0 (the basic reproduction number), which fully characterizes the qualitative dynamic behavior of the system. This parameter is the dominant eigenvalue of a rescaled version of the interaction matrix, whose rows are normalized by the corresponding recovery rates. We prove that a transcritical bifurcation occurs as R_0 crosses the threshold value 1. Specifically, we show that, if R_0 does not exceed 1, then the disease-free equilibrium is globally asymptotically stable, whereas, if R_0 is larger than 1, then the disease-free equilibrium is unstable and there exists a unique endemic equilibrium, which is asymptotically stable. As a byproduct of our analysis, we also identify key monotonicity properties of the dependence of the endemic equilibrium on the model parameters (the interaction matrix as well as the recovery rates and the loss-of-immunity rates) and obtain a distributed iterative algorithm for its computation, with provable convergence guarantees. Our results extend existing ones available in the literature for network SIRS epidemic models with rank-one interaction matrices and homogeneous recovery rates (including the single homogeneous population SIRS epidemic model).
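The threshold parameter described above can be computed directly from its definition (dominant eigenvalue of the interaction matrix with each row divided by the corresponding recovery rate); a short sketch with a hypothetical two-community interaction matrix:

```python
import numpy as np

def basic_reproduction_number(B, gamma):
    """R_0 as the dominant eigenvalue of the interaction matrix B with row i
    normalized by the corresponding recovery rate gamma_i."""
    M = B / np.asarray(gamma)[:, None]   # divide row i by gamma[i]
    return max(abs(np.linalg.eigvals(M)))

# Two-community example with heterogeneous recovery rates: the rescaled
# matrix is [[2, 1], [0.5, 1]], whose dominant eigenvalue is (3 + sqrt(3))/2.
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
gamma = np.array([1.0, 2.0])
R0 = basic_reproduction_number(B, gamma)
```

Since R0 is above 1 here, the theory above predicts a unique, asymptotically stable endemic equilibrium for this instance.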