Björn Engquist, Kui Ren, Yunan Yang
The generalization capacity of various machine learning models exhibits different phenomena in the under- and over-parameterized regimes. In this paper, we focus on regression models such as feature regression and kernel regression and analyze a generalized weighted least-squares optimization method for computational learning and inversion with noisy data. The highlight of the proposed framework is that we allow weighting in both the parameter space and the data space. The weighting scheme encodes both a priori knowledge on the object to be learned and a strategy to weight the contribution of different data points in the loss function. Here, we characterize the impact of the weighting scheme on the generalization error of the learning method, where we derive explicit generalization errors for the random Fourier feature model in both the under- and over-parameterized regimes. For more general feature maps, error bounds are provided based on the singular values of the feature matrix. We demonstrate that appropriate weighting from prior knowledge can improve the generalization capability of the learned model.
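As a rough illustration of weighting in both the data space and the parameter space, the sketch below solves a generic doubly weighted least-squares problem. It is not the estimator analyzed in the paper: the function name `weighted_least_squares`, the diagonal weights, and the Gaussian feature matrix are all assumptions made for this example.

```python
import numpy as np

def weighted_least_squares(A, y, data_weights, param_weights):
    """Minimum-norm solution of a doubly weighted least-squares problem.

    Minimizes ||D (A theta - y)||_2 and, among all minimizers, returns the one
    with the smallest weighted norm ||P theta||_2, where D = diag(data_weights)
    and P = diag(param_weights). Both weight vectors must be strictly positive.
    """
    D = np.asarray(data_weights, dtype=float)
    P = np.asarray(param_weights, dtype=float)
    # Change variables z = P theta: ordinary least squares with matrix D A P^{-1}.
    B = (D[:, None] * A) / P[None, :]
    # The pseudoinverse gives the minimum-norm least-squares solution in z.
    z = np.linalg.pinv(B) @ (D * y)
    return z / P

# Over-parameterized toy problem (hypothetical sizes and weights).
rng = np.random.default_rng(0)
n_data, n_features = 20, 50
A = rng.standard_normal((n_data, n_features))
y = rng.standard_normal(n_data)
data_w = np.ones(n_data)                    # all data points weighted equally
param_w = 1.0 + np.arange(n_features)       # assumed prior: penalize high-index features
theta = weighted_least_squares(A, y, data_w, param_w)
print("residual norm:", np.linalg.norm(A @ theta - y))
```

In the over-parameterized case the fit interpolates the data, so the parameter weights only decide which interpolant is selected; in the under-parameterized case the data weights decide how the residual is distributed across data points.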
Björn Engquist, Henrik Holst, Olof Runborg
Multiscale problems are computationally costly to solve by direct simulation because the smallest scales must be represented over a domain determined by the largest scales of the problem. We have developed and analyzed new numerical methods for multiscale wave propagation following the framework of the heterogeneous multiscale method. The numerical methods couple simulations on macro- and microscales for problems with rapidly fluctuating material coefficients. The computational complexity of the new method is significantly lower than that of traditional techniques. We focus on HMM approximation applied to long time integration of one-dimensional wave propagation problems in both periodic and non-periodic media and show that the dispersive effects that appear after long time are fully captured.
Björn Engquist, Christina Frederick
In homogenization theory and multiscale modeling, typical functions satisfy the scaling law $f^ε(x) = f(x,x/ε)$, where $f$ is periodic in the second variable and $ε$ is the smallest relevant wavelength, $0<ε\ll1$. Our main result is a new $L^{2}$-stability estimate for the reconstruction of such bandlimited multiscale functions $f^ε$ from periodic nonuniform samples. The goal of this paper is to demonstrate the close relation between sampling strategies developed in information theory and computational grids in multiscale modeling. This connection is of much interest because numerical simulations often involve discretizations by means of sampling, and meshes are routinely designed using tools from information theory. The proposed sampling sets are of optimal rate according to the minimal sampling requirements of Landau \cite{Landau}.
Bjorn Engquist, Henrik Holst, Olof Runborg
Multi-scale wave propagation problems are computationally costly to solve by traditional techniques because the smallest scales must be represented over a domain determined by the largest scales of the problem. We have developed and analyzed new numerical methods for multi-scale wave propagation in the framework of the heterogeneous multi-scale method. The numerical methods couple simulations on macro- and micro-scales for problems with rapidly oscillating coefficients. We show that the complexity of the new method is significantly lower than that of traditional techniques with a computational cost that is essentially independent of the micro-scale. A convergence proof is given and numerical results are presented for periodic problems in one, two and three dimensions. The method is also successfully applied to non-periodic problems and for long time integration where dispersive effects occur.
Björn Engquist, Kui Ren, Yunan Yang
This paper develops and analyzes a stochastic derivative-free optimization strategy. A key feature is the state-dependent adaptive variance. We prove global convergence in probability with algebraic rate and give the quantitative results in numerical examples. A striking fact is that convergence is achieved without explicit information of the gradient and even without comparing different objective function values as in established methods such as the simplex method and simulated annealing. It can otherwise be compared to annealing with state-dependent temperature.
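The idea of a state-dependent variance can be sketched in a few lines. The code below is only an illustration under stated assumptions: the variance rule `sigma = scale * sqrt(f(x) - f_min)`, the assumed known minimum value `f_min`, the function name `adaptive_random_search`, and the test objective are all hypothetical and are not the rule analyzed in the paper. What the sketch does share with the abstract is that every proposed step is taken, with no gradient and no comparison between objective values at different points.

```python
import numpy as np

def adaptive_random_search(f, x0, f_min=0.0, scale=0.2, n_steps=2000, seed=0):
    """Random walk whose step size depends only on the current objective value.

    Assumption: f_min is a known value of the global minimum, so the noise
    vanishes exactly when the iterate reaches a global minimizer.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        # Hypothetical variance rule: large noise at poor states, none at the optimum.
        sigma = scale * np.sqrt(max(f(x) - f_min, 0.0))
        # Every step is accepted; f is never compared at two different points.
        x = x + sigma * rng.standard_normal(x.shape)
        path.append(x.copy())
    return np.array(path)

def objective(x):
    """Nonconvex toy objective with global minimum value 0 at x = 0."""
    return float(0.1 * np.sum(x ** 2) + np.sum(1.0 - np.cos(3.0 * x)))

path = adaptive_random_search(objective, x0=[2.5])
print("objective at final iterate:", objective(path[-1]))
```

Because the noise level is tied to the objective value, the walk moves freely near local minima with positive value and freezes only where the objective reaches `f_min`, which is the sense in which the method resembles annealing with a state-dependent temperature.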
Björn Engquist, Yunan Yang
Seismology has been an active science for a long time. It changed character about 50 years ago when the earth's vibrations could be measured on the surface more accurately and more frequently in space and time. The full wave field could be determined, and partial differential equations (PDE) started to be used in the inverse process of finding properties of the interior of the earth. We will briefly review earlier techniques but mainly focus on Full Waveform Inversion (FWI) for the acoustic formulation. FWI is a PDE constrained optimization in which the variable velocity in a forward wave equation is adjusted such that the solution matches measured data on the surface. The minimization of the mismatch is usually coupled with the adjoint state method, which also includes the solution to an adjoint wave equation. The least-squares norm is the conventional objective function measuring the difference between simulated and measured data, but it often causes the minimization to become trapped in local minima. One way to mitigate this is by selecting another misfit function with better convexity properties. Here we propose using the quadratic Wasserstein metric as a new misfit function in FWI. The optimal map defining the quadratic Wasserstein metric can be computed by solving a Monge-Ampère equation. Theorems pointing to the advantages of using optimal transport over the least-squares norm will be discussed, and a number of large-scale computational examples will be presented.
Sean P. Carney, Björn Engquist, Robert D. Moser
Recent experimental and computational studies indicate that near wall turbulent flows can be characterized by universal small scale autonomous dynamics that are modulated by large scale structures. We formulate numerical simulations of near wall turbulence in a small domain localized to the boundary, whose size scales in viscous units. To mimic the environment in which the near wall turbulence evolves, our formulation accounts for the flux of mean momentum through the upper boundary of the domain. Comparisons of the model's two dimensional energy spectra and low order single-point statistics with the corresponding quantities computed from zero and mild favorable pressure gradient direct numerical simulations indicate it successfully captures the dynamics of the small scale near wall turbulence.
Bjorn Engquist, Brittany D. Froese, Yunan Yang
Full waveform inversion is a successful procedure for determining properties of the earth from surface measurements in seismology. This inverse problem is solved by a PDE constrained optimization where unknown coefficients in a computed wavefield are adjusted to minimize the mismatch with the measured data. We propose using the Wasserstein metric, which is related to optimal transport, for measuring this mismatch. Several advantageous properties are proved with regard to convexity of the objective function and robustness with respect to noise. The Wasserstein metric is computed by solving a Monge-Ampère equation. We describe an algorithm for computing its Fréchet gradient for use in the optimization. Numerical examples are given.
Yunan Yang, Björn Engquist, Junzhe Sun, Brittany D. Froese
Conventional full-waveform inversion (FWI) using the least-squares norm ($L^2$) as a misfit function is known to suffer from cycle skipping. This increases the risk of computing a local rather than the global minimum of the misfit. In our previous work, we proposed the quadratic Wasserstein metric ($W_2$) as a new misfit function for FWI. The $W_2$ metric has been proved to have many ideal properties with regard to convexity and insensitivity to noise. When the observed and predicted seismic data are regarded as two density functions, the quadratic Wasserstein metric corresponds to the optimal cost of rearranging one density into the other, where the transportation cost is quadratic in distance. The difficulty of transforming seismic signals into nonnegative density functions is discussed. Unlike the $L^2$ norm, $W_2$ measures not only amplitude differences, but also global phase shifts, which helps to avoid cycle skipping issues. In this work, we build on our earlier method to cover more realistic high-resolution applications by embedding the $W_2$ technique into the framework of the adjoint-state method and applying it to seismically relevant 2D examples: the Camembert, the Marmousi, and the 2004 BP models. We propose a new way of using the $W_2$ metric trace-by-trace in FWI and compare it to global $W_2$ via the solution of the Monge-Ampère equation. With the corresponding adjoint source, the velocity model can be updated using the L-BFGS method. Numerical results show the effectiveness of $W_2$ for alleviating cycle skipping issues and sensitivity to noise. Both mathematical theory and numerical examples demonstrate that the quadratic Wasserstein metric is a good candidate for a misfit function in seismic inversion.
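For a single trace, the quadratic Wasserstein distance between two one-dimensional densities has the standard closed form $W_2^2(f,g) = \int_0^1 |F^{-1}(s) - G^{-1}(s)|^2\,ds$, where $F$ and $G$ are the cumulative distribution functions. The sketch below evaluates this formula numerically and sums it over traces. It assumes the traces are already nonnegative; the function names, the small `floor` regularization, and the Gaussian test pulses are choices made for this example, and the adjoint source and the normalization of signed seismic data are not covered.

```python
import numpy as np

def cdf_on_grid(rho, t):
    """Normalized cumulative distribution of a nonnegative signal (trapezoid rule)."""
    increments = 0.5 * (rho[1:] + rho[:-1]) * np.diff(t)
    cdf = np.concatenate(([0.0], np.cumsum(increments)))
    return cdf / cdf[-1]

def w2_squared_1d(f, g, t, n_quantiles=2000, floor=1e-10):
    """Squared W2 distance between two nonnegative 1D signals sampled on t.

    The tiny floor (an assumption of this sketch) keeps both CDFs strictly
    increasing so that they can be inverted by interpolation.
    """
    s = (np.arange(n_quantiles) + 0.5) / n_quantiles   # midpoints in (0, 1)
    inv_f = np.interp(s, cdf_on_grid(f + floor, t), t)  # F^{-1}(s)
    inv_g = np.interp(s, cdf_on_grid(g + floor, t), t)  # G^{-1}(s)
    return float(np.mean((inv_f - inv_g) ** 2))

def trace_by_trace_misfit(observed, synthetic, t):
    """Sum of 1D squared W2 distances over traces; arrays have shape (n_t, n_traces)."""
    return sum(w2_squared_1d(observed[:, j], synthetic[:, j], t)
               for j in range(observed.shape[1]))

# Shifted Gaussian pulses: W2^2 grows with the shift, the L2 misfit saturates.
t = np.linspace(0.0, 1.0, 2001)
pulse = lambda center: np.exp(-0.5 * ((t - center) / 0.03) ** 2)
for shift in (0.2, 0.4):
    w2 = w2_squared_1d(pulse(0.3), pulse(0.3 + shift), t)
    l2 = float(np.sum((pulse(0.3) - pulse(0.3 + shift)) ** 2) * (t[1] - t[0]))
    print(f"shift={shift}: W2^2={w2:.4f}, L2^2={l2:.4f}")
```

For two copies of the same pulse, the squared $W_2$ distance is approximately the squared shift, so it keeps growing as the pulses separate, whereas the $L^2$ misfit stops changing once the pulses no longer overlap. This is the one-dimensional mechanism behind the reduced sensitivity to cycle skipping.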
Björn Engquist, Lexing Ying
The paper introduces the sweeping preconditioner, which is highly efficient for iterative solutions of the variable coefficient Helmholtz equation including very high frequency problems. The first central idea of this novel approach is to construct an approximate factorization of the discretized Helmholtz equation by sweeping the domain layer by layer, starting from an absorbing layer or boundary condition. Given this specific order of factorization, the second central idea of this approach is to represent the intermediate matrices in the hierarchical matrix framework. In two dimensions, both the construction and the application of the preconditioners are of linear complexity. The GMRES solver with the resulting preconditioner converges in an amazingly small number of iterations, which is essentially independent of the number of unknowns. This approach is also extended to the three dimensional case with some success. Numerical results are provided in both two and three dimensions to demonstrate the efficiency of this new approach.
Bjorn Engquist, Brittany D. Froese, Yen-Hsi Richard Tsai
The idea of using fast sweeping methods for solving stationary systems of conservation laws has previously been proposed for efficiently computing solutions with sharp shocks. We further develop these methods to allow for a more challenging class of problems including problems with sonic points, shocks originating in the interior of the domain, rarefaction waves, and two-dimensional systems. We show that fast sweeping methods can produce higher-order accuracy. Computational results validate the claims of accuracy, sharp shock curves, and optimal computational efficiency.
Qiang Du, Bjorn Engquist, Xiaochuan Tian
In this work, we review the connection between the subjects of homogenization and nonlocal modeling and discuss the relevant computational issues. By further exploring this connection, we hope to promote the cross fertilization of ideas from the different research fronts. We illustrate how homogenization may help characterize the nature and the form of nonlocal interactions hypothesized in nonlocal models. We also offer some perspective on how studies of nonlocality may help the development of more effective numerical methods for homogenization.
Gopal R. Yalla, Todd A. Oliver, Sigfried W. Haering, Björn Engquist, Robert D. Moser
Large Eddy Simulation (LES) of turbulence in complex geometries is often conducted using strongly inhomogeneous resolution. The issues associated with resolution inhomogeneity are related to the noncommutativity of the filtering and differentiation operators, which introduces a commutation term into the governing equations. Neglect of this commutation term gives rise to commutation error. While the commutation error is well recognized, it is often ignored in practice. Moreover, the commutation error arising from the implicit part of the filter (i.e., projection onto the underlying discretization) has not been well investigated. Modeling the commutator between numerical projection and differentiation is crucial for correcting errors induced by resolution inhomogeneity in practical LES settings, which typically rely solely on implicit filtering. Here, we employ a multiscale asymptotic analysis to investigate the characteristics of the commutator. This provides a statistical description of the commutator, which can serve as a target for the statistical characteristics of a commutator model. Further, we investigate how commutation error manifests in simulation and demonstrate its impact on the convection of a packet of homogeneous isotropic turbulence through an inhomogeneous grid. A connection is made between the commutation error and the propagation properties of the underlying numerics. A modeling approach for the commutator is proposed that is applicable to LES with filters that include projections to the discrete solution space and that respects the numerical properties of the LES evolution equation. It may also be useful in addressing other LES modeling issues such as discretization error.
Bjorn Engquist, Yunan Yang
Full-waveform inversion (FWI) is today a standard process for the inverse problem of seismic imaging. PDE-constrained optimization is used to determine unknown parameters in a wave equation that represent geophysical properties. The objective function measures the misfit between the observed data and the calculated synthetic data, and it has traditionally been the least-squares norm. In a sequence of papers, we introduced the Wasserstein metric from optimal transport as an alternative misfit function for mitigating the so-called cycle skipping, which is the trapping of the optimization process in local minima. In this paper, we first give a sharper theorem regarding the convexity of the Wasserstein metric as the objective function. We then focus on two new issues. One is the normalization needed to turn seismic signals into probability measures such that the theory of optimal transport applies. The other, which is beyond cycle skipping, is the inversion for parameters below reflecting interfaces. For the first, we propose a class of normalizations and prove several favorable properties for this class. For the latter, we demonstrate that FWI using optimal transport can recover geophysical properties from domains through which no seismic waves travel. We finally illustrate these properties by the realistic application of imaging salt inclusions, which has been a significant challenge in exploration geophysics.
Bjorn Engquist, Brittany D. Froese
Seismic signals are typically compared using travel time difference or $L_2$ difference. We propose the Wasserstein metric as an alternative measure of fidelity or misfit in seismology. It exhibits properties from both of the traditional measures mentioned above. The numerical computation is based on the recent development of fast numerical methods for the Monge-Ampere equation and optimal transport. Applications to waveform inversion and registration are discussed and simple numerical examples are presented.
Yoonsang Lee, Bjorn Engquist
We propose a fast integrator for a class of dynamical systems with several temporal scales. The proposed method is developed as an extension of the variable step size Heterogeneous Multiscale Method (VSHMM), which is a two-scale integrator developed by the authors. While iterated applications of multiscale integrators for two different scales increase the computational complexity exponentially as the number of different scales increases, the proposed method has computational complexity linearly proportional to the number of different scales. This efficiency is achieved by solving different scale components of the vector fields with variable time steps. It is shown that variable time stepping of different force components has an effect of fast integration for the effective force of the slow dynamics. The proposed fast integrator is numerically tested on problems with several different scales which are dissipative and highly oscillatory including multiscale partial differential equations with sparsity in the solution space.
Yacine Mokhtari, Christina Frederick, Yunan Yang, Bjorn Engquist
We study the inverse problem of reconstructing an incompressible velocity field $\boldsymbol{v}$ from observations of the induced magnetic field $\boldsymbol{b}$. In the presence of a strong, constant background field $\mathbf{F}$, the evolution of the magnetic perturbation $\boldsymbol{b}$ is governed by the linearized induction equation. We analyze the system on both the entire space $Ω= \mathbb{R}^d$ and a periodic domain $Ω= \prod_{i=1}^d [0, L_i)$, which models a homogeneous medium with side lengths $L_i > 0$. We analyze this problem by decomposing it into the injectivity of a parabolic forward map and the solvability of a divergence-free transport sub-problem. On the whole space $\mathbb{R}^d$, we show that the transport sub-problem is well-posed when data is prescribed on a non-characteristic hypersurface transverse to $\mathbf{F}$. On the torus, we establish a sharp uniqueness criterion based on the rational dependence of the ratios $\{F_i/L_i\}_{i=1}^d$ between the background-field components and the corresponding domain periods. Furthermore, we show that a Diophantine condition on the background field is sufficient for the reconstructed velocity to belong to $L^2$. The proof combines injectivity of the parabolic forward map with uniqueness for a steady transport equation along $\mathbf{F}$.
Björn Engquist, Lexing Ying
This paper introduces a new sweeping preconditioner for the iterative solution of the variable coefficient Helmholtz equation in two and three dimensions. The algorithms follow the general structure of constructing an approximate $LDL^t$ factorization by eliminating the unknowns layer by layer starting from an absorbing layer or boundary condition. The central idea of this paper is to approximate the Schur complement matrices of the factorization using moving perfectly matched layers (PMLs) introduced in the interior of the domain. Applying each Schur complement matrix is equivalent to solving a quasi-1D problem with a banded LU factorization in the 2D case and to solving a quasi-2D problem with a multifrontal method in the 3D case. The resulting preconditioner has linear application cost and the preconditioned iterative solver converges in a number of iterations that is essentially independent of the number of unknowns or the frequency. Numerical results are presented in both two and three dimensions to demonstrate the efficiency of this new preconditioner.
Sean P. Carney, Milica Dussinger, Bjorn Engquist
Numerical homogenization of multiscale equations typically requires taking an average of the solution to a microscale problem. Both the boundary conditions and domain size of the microscale problem play an important role in the accuracy of the homogenization procedure. In particular, imposing naive boundary conditions leads to an $\mathcal{O}(ε/η)$ error in the computation, where $ε$ is the characteristic size of the microscopic fluctuations in the heterogeneous media, and $η$ is the size of the microscopic domain. This so-called boundary, or ``cell resonance'', error can dominate discretization error and pollute the entire homogenization scheme. There exist several techniques in the literature to reduce the error. Most strategies involve modifying the form of the microscale cell problem. Below we present an alternative procedure based on the observation that the resonance error itself is an oscillatory function of domain size $η$. After rigorously characterizing the oscillatory behavior for one dimensional and quasi-one dimensional microscale domains, we present a novel strategy to reduce the resonance error. Rather than modifying the form of the cell problem, the original problem is solved for a sequence of domain sizes, and the results are averaged against kernels satisfying certain moment conditions and regularity properties. Numerical examples in one and two dimensions illustrate the utility of the approach.
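A one-dimensional toy computation can make the averaging over domain sizes concrete. In 1D the effective coefficient obtained from a cell of size $η$ is the harmonic mean of the coefficient over that cell, so the resonance error can be evaluated directly. Everything specific below is an assumption of this sketch rather than the paper's setup: the coefficient `inv_coeff`, the value of `EPS`, the cell centered at the origin, and the Hann window used as a stand-in for kernels satisfying the moment and regularity conditions.

```python
import numpy as np

EPS = 0.01    # hypothetical period of the microscale oscillations
A_HOM = 1.0   # exact homogenized coefficient: 1 / (mean of inv_coeff over one period)

def inv_coeff(y):
    """Reciprocal 1/a(y) of a 1-periodic coefficient (illustrative choice)."""
    return 1.0 + 0.5 * np.cos(2 * np.pi * y + 0.7) + 0.2 * np.cos(4 * np.pi * y + 0.3)

def cell_average(eta, n=4000):
    """Effective coefficient from a centered cell of size eta (1D harmonic mean)."""
    x = (np.arange(n) + 0.5) / n * eta - 0.5 * eta   # midpoint rule on [-eta/2, eta/2]
    return 1.0 / np.mean(inv_coeff(x / EPS))

def kernel_averaged(eta_min, eta_max, n_sizes=201):
    """Average the cell results over a range of domain sizes against a smooth window."""
    etas = np.linspace(eta_min, eta_max, n_sizes)
    s = (etas - eta_min) / (eta_max - eta_min)
    weights = 1.0 - np.cos(2 * np.pi * s)            # Hann window, vanishes at both ends
    values = np.array([cell_average(e) for e in etas])
    return float(np.sum(weights * values) / np.sum(weights)), etas, values

avg, etas, values = kernel_averaged(10 * EPS, 15 * EPS)
print("worst single-cell error:", np.max(np.abs(values - A_HOM)))
print("kernel-averaged error:  ", abs(avg - A_HOM))
```

The single-cell error oscillates in $η$ with an amplitude of order $ε/η$, and the weighted average over domain sizes cancels most of that oscillation without changing the cell problem itself.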
Xiaochuan Tian, Bjorn Engquist
Developments of nonlocal operators for modeling processes that traditionally have been described by local differential operators have been increasingly active during the last few years. One example is peridynamics for brittle materials and another is nonstandard diffusion including the use of fractional derivatives. A major obstacle for application of these methods is the high computational cost from the numerical implementation of the nonlocal operators. It is natural to consider fast methods of fast multipole or hierarchical matrix type to overcome this challenge. Unfortunately the relevant kernels do not satisfy the standard necessary conditions. In this work a new class of fast algorithms is developed and analyzed, which in some cases reduces the computational complexity of applying nonlocal operators to essentially the same order of magnitude as the complexity of standard local numerical methods.