Mingyu Mo, Yimin Wei, Qi Ye
In this paper, we use a splitting method to solve support vector machines in reproducing kernel Banach spaces with lower semi-continuous loss functions. We equivalently transform support vector machines in reproducing kernel Banach spaces with lower semi-continuous loss functions into a finite-dimensional tensor optimization problem and propose a splitting method based on the alternating direction method of multipliers. By the Kurdyka-Łojasiewicz inequality, the iterative sequence obtained by this splitting method converges globally to a stationary point if the loss function is lower semi-continuous and subanalytic. Finally, several numerical experiments demonstrate the effectiveness of the method.
Qi Ye
This article delves into the study of the theory of regularized learning in Banach spaces for linear-functional data. It encompasses discussions on representer theorems, pseudo-approximation theorems, and convergence theorems. Regularized learning is designed to minimize regularized empirical risks over a Banach space. The empirical risks are calculated by utilizing training data and multi-loss functions. The input training data are composed of linear functionals in a predual space of the Banach space to capture discrete local information from multimodal data and multiscale models. Through regularized learning, global approximations of the exact solution to an unidentified or uncertain original problem are achieved. In the convergence theorems, the convergence of the approximate solutions to the exact solution is established through the utilization of the weak* topology of the Banach space. The theorems of regularized learning are utilized in the interpretation of classical machine learning, such as support vector machines and artificial neural networks.
Igor Cialenco, Gregory E. Fasshauer, Qi Ye
In this paper we present the theoretical framework needed to justify the use of a kernel-based collocation method (meshfree approximation method) to estimate the solution of high-dimensional stochastic partial differential equations (SPDEs). Using an implicit time stepping scheme, we transform stochastic parabolic equations into stochastic elliptic equations. Our main attention is concentrated on the numerical solution of the elliptic equations at each time step. The estimator of the solution of the elliptic equations is given as a linear combination of reproducing kernels derived from the differential and boundary operators of the SPDE centered at collocation points to be chosen by the user. The random expansion coefficients are computed by solving a random system of linear equations. Numerical experiments demonstrate the feasibility of the method.
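The elliptic step described above can be sketched in a much simplified, deterministic form. The following is a Kansa-type kernel collocation sketch for $-u''=f$ on $(0,1)$ with homogeneous boundary conditions; the Gaussian kernel, shape parameter, point count, and test problem are illustrative assumptions, not the paper's SPDE construction (where the expansion coefficients come from a random linear system).

```python
import numpy as np

# Simplified, deterministic kernel collocation sketch for
# -u''(x) = f(x) on (0, 1), u(0) = u(1) = 0, with true solution
# u(x) = sin(pi * x).  Kernel, shape parameter, and point count are
# illustrative choices, not taken from the paper.
eps = 4.0                                   # Gaussian shape parameter
centers = np.linspace(0.0, 1.0, 21)         # collocation points/centers

def phi(x, c):
    return np.exp(-(eps * (x - c)) ** 2)    # Gaussian kernel

def neg_phi_xx(x, c):
    d = x - c                               # -phi''(x) for the PDE rows
    return (2 * eps**2 - 4 * eps**4 * d**2) * np.exp(-(eps * d) ** 2)

X, C = np.meshgrid(centers, centers, indexing="ij")
interior = (centers > 0) & (centers < 1)

# Collocation matrix: PDE operator at interior points, identity on boundary.
A = np.where(interior[:, None], neg_phi_xx(X, C), phi(X, C))
rhs = np.where(interior, np.pi**2 * np.sin(np.pi * centers), 0.0)

# The SPDE setting yields random expansion coefficients; here they are
# deterministic, and lstsq tames the ill-conditioned Gaussian system.
coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)

u_hat = phi(X, C) @ coef                    # estimator at the centers
err = np.max(np.abs(u_hat - np.sin(np.pi * centers)))
```

In the paper's setting the right-hand side is random, so the same linear solve is repeated per realization to obtain random expansion coefficients.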
Yuesheng Xu, Qi Ye
This article studies constructions of reproducing kernel Banach spaces (RKBSs) which may be viewed as a generalization of reproducing kernel Hilbert spaces (RKHSs). A key point is to endow Banach spaces with reproducing kernels such that machine learning in RKBSs can be well-posed and of easy implementation. First we verify many advanced properties of the general RKBSs such as density, continuity, separability, implicit representation, imbedding, compactness, representer theorem for learning methods, oracle inequality, and universal approximation. Then, we develop a new concept of generalized Mercer kernels to construct $p$-norm RKBSs for $1\leq p\leq\infty$. The $p$-norm RKBSs preserve the same simple format as the Mercer representation of RKHSs. Moreover, the $p$-norm RKBSs are isometrically equivalent to the standard $p$-norm spaces of countable sequences. Hence, the $p$-norm RKBSs possess more geometrical structures than RKHSs, including sparsity. The generalized Mercer kernels also cover many well-known kernels, for example, min kernels, Gaussian kernels, and power series kernels. Finally, we propose to solve the support vector machines in the $p$-norm RKBSs, which are to minimize the regularized empirical risks over the $p$-norm RKBSs. We show that the infinite-dimensional support vector machines in the $p$-norm RKBSs can be equivalently transformed into finite-dimensional convex optimization problems such that we obtain the finite-dimensional representations of the support vector machine solutions for practical applications. In particular, we verify that some special support vector machines in the $1$-norm RKBSs are equivalent to the classical $1$-norm sparse regressions. This provides fundamental support for a novel learning tool called sparse learning methods to be investigated in our next research project.
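The equivalence of special $1$-norm RKBS support vector machines to classical $1$-norm sparse regression can be illustrated with a minimal sketch: iterative soft-thresholding (ISTA) applied to a kernel-matrix least-squares problem with an $\ell_1$ penalty. The kernel, toy data, and regularization parameter below are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# Minimal ISTA sketch for the 1-norm sparse regression
#   min_c  0.5 * ||K c - y||^2 + lam * ||c||_1,
# with K a kernel matrix on toy data (illustrative assumption).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(30)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)   # Gaussian kernel matrix

lam = 0.1
step = 1.0 / np.linalg.norm(K, 2) ** 2              # 1/L for the smooth part

def soft(v, t):                                     # soft-thresholding prox
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(c):
    return 0.5 * np.sum((K @ c - y) ** 2) + lam * np.sum(np.abs(c))

c = np.zeros(30)
obj0 = objective(c)
for _ in range(500):
    # gradient step on the quadratic part, then the l1 prox
    c = soft(c - step * K.T @ (K @ c - y), step * lam)
obj1 = objective(c)
```

The soft-thresholding step is what drives coefficients exactly to zero, which is the sparsity mechanism the abstract attributes to the $1$-norm geometry.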
Qi Ye
In this paper we introduce a generalization of the classical $L_2(\mathbb{R}^d)$-based Sobolev spaces with the help of a vector differential operator $\mathbf{P}$ which consists of finitely or countably many differential operators $P_n$ which themselves are linear combinations of distributional derivatives. We find that certain proper full-space Green functions $G$ with respect to $L=\mathbf{P}^{\ast T}\mathbf{P}$ are positive definite functions. Here we ensure that the vector distributional adjoint operator $\mathbf{P}^{\ast}$ of $\mathbf{P}$ is well-defined in the distributional sense. We then provide sufficient conditions under which our generalized Sobolev space will become a reproducing-kernel Hilbert space whose reproducing kernel can be computed via the associated Green function $G$. As an application of this theoretical framework we use $G$ to construct multivariate minimum-norm interpolants $s_{f,X}$ to data sampled from a generalized Sobolev function $f$ on $X$. Among other examples we show the reproducing-kernel Hilbert space of the Gaussian function is equivalent to a generalized Sobolev space.
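A minimum-norm kernel interpolant $s_{f,X}$ of the kind constructed above can be sketched in one dimension: solve the kernel system at the data sites and evaluate the resulting expansion. The Gaussian kernel width, nodes, and test function are illustrative assumptions, not the paper's Green-function construction.

```python
import numpy as np

# Sketch of a minimum-norm kernel interpolant s_{f,X}: interpolate samples
# of f at data sites X with a Gaussian reproducing kernel (illustrative
# kernel, width, and test function).
def gauss(x, z, eps=3.0):
    return np.exp(-(eps * (x[:, None] - z[None, :])) ** 2)

f = lambda x: np.cos(2 * np.pi * x)
X = np.linspace(0.0, 1.0, 9)                 # data sites
coef = np.linalg.solve(gauss(X, X), f(X))    # solve K c = f(X)

def s(x):                                    # the interpolant s_{f,X}
    return gauss(np.atleast_1d(x), X) @ coef

node_err = np.max(np.abs(s(X) - f(X)))       # exact at the nodes
mid = np.linspace(0.05, 0.95, 50)
approx_err = np.max(np.abs(s(mid) - f(mid))) # small between nodes
```

The interpolant reproduces the data exactly at the nodes and, for $f$ in the associated native space, the error between nodes is controlled by the fill distance of $X$.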
Qi Ye
In this article, we use the knowledge of positive definite tensors to develop a concept of positive definite multi-kernels to construct the kernel-based interpolants of scattered data. By the techniques of reproducing kernel Banach spaces, the optimal recoveries and error analysis of the kernel-based interpolants are shown for a special class of strictly positive definite multi-kernels.
Qi Ye
This article gives a new insight of kernel-based (approximation) methods to solve the high-dimensional stochastic partial differential equations. We will combine the techniques of meshfree approximation and kriging interpolation to extend the kernel-based methods for the deterministic data to the stochastic data. The main idea is to endow the Sobolev spaces with the probability measures induced by the positive definite kernels such that the Gaussian random variables can be well-defined on the Sobolev spaces. The constructions of these Gaussian random variables provide the kernel-based approximate solutions of the stochastic models. In the numerical examples of the stochastic Poisson and heat equations, we show that the approximate probability distributions are well-posed for various kinds of kernels such as the compactly supported kernels (Wendland functions) and the Sobolev-spline kernels (Matérn functions).
Gregory E. Fasshauer, Qi Ye
We introduce a vector differential operator $\mathbf{P}$ and a vector boundary operator $\mathbf{B}$ to derive a reproducing kernel along with its associated Hilbert space which is shown to be embedded in a classical Sobolev space. This reproducing kernel is a Green kernel of differential operator $L:=\mathbf{P}^{\ast T}\mathbf{P}$ with homogeneous or nonhomogeneous boundary conditions given by $\mathbf{B}$, where we ensure that the distributional adjoint operator $\mathbf{P}^{\ast}$ of $\mathbf{P}$ is well-defined in the distributional sense. We represent the inner product of the reproducing-kernel Hilbert space in terms of the operators $\mathbf{P}$ and $\mathbf{B}$. In addition, we find relationships for the eigenfunctions and eigenvalues of the reproducing kernel and the operators with homogeneous or nonhomogeneous boundary conditions. These eigenfunctions and eigenvalues are used to compute a series expansion of the reproducing kernel and an orthonormal basis of the reproducing-kernel Hilbert space. Our theoretical results provide perhaps a more intuitive way of understanding what kind of functions are well approximated by the reproducing kernel-based interpolant to a given multivariate data sample.
Mingjia Huo, Kewen Wu, Qi Ye
Bootstrapping is a crucial but computationally expensive step for realizing Fully Homomorphic Encryption (FHE). Recently, Chen and Han (Eurocrypt 2018) introduced a family of low-degree polynomials to extract the lowest digit with respect to a certain congruence, which helps improve the bootstrapping for both FV and BGV schemes. In this note, we present the following relevant findings about the work of Chen and Han (referred to as CH18): 1. We provide a simpler construction of the low-degree polynomials that serve the same purpose and match the asymptotic bound achieved in CH18; 2. We show the optimality and limit of our approach by solving a minimal polynomial degree problem; 3. We consider the problem of extracting other low-order digits using polynomials, and provide negative results.
Qi Ye
In this article, we solve a deterministically generalized interpolation problem by a stochastic approach. We introduce a kernel-based probability measure on a Banach space by a covariance kernel which is defined on the dual space of the Banach space. The kernel-based probability measure provides a numerical tool to construct and analyze the kernel-based estimators conditioned on non-noise data or noisy data including algorithms and error analysis. Same as meshfree methods, we can also obtain the kernel-based approximate solutions of elliptic partial differential equations by the kernel-based probability measure.
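A kernel-based estimator conditioned on noisy data, as described above, reduces in the simplest finite-dimensional case to a Gaussian-process-style posterior mean and variance induced by the covariance kernel. The kernel, noise level, and data below are illustrative assumptions.

```python
import numpy as np

# Sketch of a kernel-based estimator conditioned on noisy data: posterior
# mean and variance induced by a covariance kernel (illustrative kernel,
# noise level, and toy data).
def k(a, b, eps=2.0):
    return np.exp(-(eps * (a[:, None] - b[None, :])) ** 2)

rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 12)                        # data sites
y = np.sin(3 * X) + 0.05 * rng.standard_normal(12)   # noisy observations
sigma2 = 0.05 ** 2                                   # assumed noise variance

Kxx = k(X, X) + sigma2 * np.eye(12)                  # regularized kernel matrix
alpha = np.linalg.solve(Kxx, y)

def posterior(xs):
    Ks = k(xs, X)
    mean = Ks @ alpha                                # conditional mean
    # conditional variance: k(x,x) - k(x,X) Kxx^{-1} k(X,x), with k(x,x)=1
    var = 1.0 - np.sum(Ks * np.linalg.solve(Kxx, Ks.T).T, axis=1)
    return mean, var

xs = np.linspace(0.0, 1.0, 25)
mean, var = posterior(xs)
```

For non-noise data one would set the noise variance to zero (up to a jitter for conditioning), recovering the deterministic kernel interpolant as the conditional mean.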
Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Janani Padmanabhan, Graham Neubig
The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases -- providing gains of up to 20 BLEU points in the most favorable setting.
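The basic mechanism studied above, seeding an NMT embedding matrix with pre-trained vectors, can be sketched as follows. The toy vocabulary and vectors are illustrative; real systems load embeddings such as word2vec or GloVe into the encoder/decoder lookup tables.

```python
import numpy as np

# Sketch of initializing an embedding matrix from pre-trained word
# vectors: rows for covered words are copied, the rest stay randomly
# initialized.  Toy vocabulary and vectors are illustrative assumptions.
dim = 4
pretrained = {"cat": np.ones(dim), "dog": np.full(dim, 2.0)}  # toy vectors
vocab = {"<unk>": 0, "cat": 1, "fish": 2, "dog": 3}

rng = np.random.default_rng(0)
emb = 0.1 * rng.standard_normal((len(vocab), dim))  # random fallback init
n_copied = 0
for word, idx in vocab.items():
    if word in pretrained:                          # copy pre-trained row
        emb[idx] = pretrained[word]
        n_copied += 1
```

Out-of-vocabulary rows (here `<unk>` and `fish`) keep their random initialization and are learned during training.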
Zongxia Liang, Qi Ye
May 15, 2024 · q-fin.MF
This paper diverges from previous literature by considering the utility maximization problem in the context of investors having the freedom to actively acquire additional information to mitigate estimation risk. We derive closed-form value functions using CARA and CRRA utility functions and establish a criterion for valuing extra information through certainty equivalence, while also formulating its associated acquisition cost. By strategically employing variational methods, we explore the optimal acquisition of information, taking into account the trade-off between its value and cost. Our findings indicate that acquiring earlier information holds greater worth in eliminating estimation risk and achieving higher utility. Furthermore, we observe that investors with lower risk aversion are more inclined to pursue information acquisition.
Zongxia Liang, Qi Ye
This paper delves into financial markets that incorporate a novel form of heterogeneity among investors, specifically in terms of their beliefs regarding the reliability of signals in the business cycle economy model, which may be biased. Unlike most papers in this field, we not only analyze the equilibrium but also examine welfare using objective measures, while investors aim to maximize their utility based on subjective measures. Furthermore, we introduce passive investors and use their utility as a benchmark, thereby revealing that a double-loss phenomenon can sometimes arise. In the analysis, we examine two effects, the distortion effect on total welfare and the advantage effect of information, and highlight the key factors that influence them, with a particular emphasis on the proportion of investors. We also demonstrate that manipulating investors' estimation of the economy can be a way to improve utility, and we identify an inner connection between welfare and survival.
Mingyu Mo, Qi Ye
In this paper, we study a splitting method based on the alternating direction method of multipliers for support vector machines in reproducing kernel Hilbert spaces with lower semi-continuous loss functions. If the loss function is lower semi-continuous and subanalytic, we use the Kurdyka-Łojasiewicz inequality to show that the iterative sequence induced by the splitting method globally converges to a stationary point. The numerical experiments also demonstrate the effectiveness of the splitting method.
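An ADMM-style splitting of a kernel SVM of the kind discussed above can be sketched as follows, here with the (convex) hinge loss so the z-update has a closed-form prox. The kernel, toy data, and parameters are illustrative assumptions; this is not the paper's algorithm for general lower semi-continuous losses.

```python
import numpy as np

# Hedged ADMM sketch for a kernel SVM with hinge loss,
#   min_a  (1/n) * sum_i max(0, 1 - y_i (K a)_i) + lam * a^T K a,
# split via z = K a.  Kernel, data, and parameters are toy choices.
rng = np.random.default_rng(0)
n = 40
Xp = rng.normal([2.0, 0.0], 0.3, (n // 2, 2))      # class +1 cluster
Xm = rng.normal([-2.0, 0.0], 0.3, (n // 2, 2))     # class -1 cluster
X = np.vstack([Xp, Xm])
y = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-D2) + 1e-8 * np.eye(n)                 # Gaussian kernel matrix

lam, rho = 0.01, 1.0
t = 1.0 / (n * rho)                                # prox parameter

def prox_hinge(w, y, t):
    # prox of v -> t * max(0, 1 - y v), applied elementwise
    s = y * w
    return np.where(s > 1.0, w, np.where(s < 1.0 - t, w + t * y, y))

a = np.zeros(n)
z = np.zeros(n)
u = np.zeros(n)
M = 2.0 * lam * K + rho * K.T @ K                  # a-update system matrix
for _ in range(200):
    a = np.linalg.solve(M, rho * K.T @ (z - u))    # a-update (linear solve)
    z = prox_hinge(K @ a + u, y, t)                # z-update (hinge prox)
    u += K @ a - z                                 # dual update

accuracy = np.mean(np.sign(K @ a) == y)            # training accuracy
```

The splitting isolates the nonsmooth loss in the z-update; for a lower semi-continuous, subanalytic loss the same structure applies with the corresponding (possibly set-valued) prox, which is where the Kurdyka-Łojasiewicz analysis enters.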
Qi Ye, Lihua Guo
In recent years, "U-shaped" neural networks featuring encoder and decoder structures have gained popularity in the field of medical image segmentation, and various variants of this model have been developed. Nevertheless, the evaluation of these models has received less attention than model development. In response, we propose a comprehensive method for evaluating medical image segmentation models with multiple indicators over multiple organs (named MIMO). MIMO allows models to generate independent thresholds, which are then combined with multi-indicator evaluation and confidence estimation to screen and measure each organ. As a result, MIMO offers detailed information on the segmentation of each organ in each sample, thereby aiding developers in analyzing and improving the model. Additionally, MIMO can produce concise usability and comprehensiveness scores for different models. Models with higher scores are deemed to be excellent models, which is convenient for clinical evaluation. Our research tests eight different medical image segmentation models on two abdominal multi-organ datasets and evaluates them from four perspectives: correctness, confidence estimation, usable region, and MIMO. Furthermore, robustness experiments are conducted. Experimental results demonstrate that MIMO offers novel insights into multi-indicator, multi-organ medical image evaluation and provides a specific and concise measure of the usability and comprehensiveness of a model. Code: https://github.com/SCUT-ML-GUO/MIMO
Gregory E. Fasshauer, Fred J. Hickernell, Qi Ye
In this paper we solve support vector machines in reproducing kernel Banach spaces with reproducing kernels defined on nonsymmetric domains instead of the traditional methods in reproducing kernel Hilbert spaces. Using the orthogonality of semi-inner-products, we can obtain the explicit representations of the dual (normalized-duality-mapping) elements of support vector machine solutions. In addition, we can introduce the reproduction property in a generalized native space by Fourier transform techniques such that it becomes a reproducing kernel Banach space, which can even be embedded into Sobolev spaces, and its reproducing kernel is set up by the related positive definite function. The representations of the optimal solutions of support vector machines (regularized empirical risks) in these reproducing kernel Banach spaces are formulated explicitly in terms of positive definite functions, and their finite numbers of coefficients can be computed by fixed point iteration. We also give some typical examples of reproducing kernel Banach spaces induced by Matérn functions (Sobolev splines) so that their support vector machine solutions are as readily computable as those of the classical algorithms. Moreover, each of their reproducing bases includes information from multiple training data points. The concept of reproducing kernel Banach spaces offers us a new numerical tool for solving support vector machines.
Gregory E. Fasshauer, Qi Ye
In this paper we introduce a generalized Sobolev space by defining a semi-inner product formulated in terms of a vector distributional operator $\mathbf{P}$ consisting of finitely or countably many distributional operators $P_n$, which are defined on the dual space of the Schwartz space. The types of operators we consider include not only differential operators, but also more general distributional operators such as pseudo-differential operators. We deduce that a certain appropriate full-space Green function $G$ with respect to $L:=\mathbf{P}^{\ast T}\mathbf{P}$ now becomes a conditionally positive definite function. In order to support this claim we ensure that the distributional adjoint operator $\mathbf{P}^{\ast}$ of $\mathbf{P}$ is well-defined in the distributional sense. Under sufficient conditions, the native space (reproducing-kernel Hilbert space) associated with the Green function $G$ can be isometrically embedded into or even be isometrically equivalent to a generalized Sobolev space. As an application, we take linear combinations of translates of the Green function with possibly added polynomial terms and construct a multivariate minimum-norm interpolant $s_{f,X}$ to data values sampled from an unknown generalized Sobolev function $f$ at data sites located in some set $X \subset \mathbb{R}^d$. We provide several examples, such as Matérn kernels or Gaussian kernels, that illustrate how many reproducing-kernel Hilbert spaces of well-known reproducing kernels are isometrically equivalent to a generalized Sobolev space. These examples further illustrate how we can rescale the Sobolev spaces by the vector distributional operator $\mathbf{P}$. Introducing the notion of scale as part of the definition of a generalized Sobolev space may help us to choose the "best" kernel function for kernel-based approximation methods.
Xiaohan Jin, Ye Qi, Shangxuan Wu
Face-off is an interesting case of style transfer where the facial expressions and attributes of one person can be fully transformed to another face. We are interested in the unsupervised training process which only requires two sequences of unaligned video frames from each person and learns what shared attributes to extract automatically. In this project, we explored various improvements for adversarial training (i.e., CycleGAN [Zhu et al., 2017]) to capture details in facial expressions and head poses and thus generate transformation videos of higher consistency and stability.
Qi Ye, Tae-Kyun Kim
Learning and predicting the pose parameters of a 3D hand model given an image, such as locations of hand joints, is challenging due to large viewpoint changes and articulations, and severe self-occlusions exhibited particularly in egocentric views. Both feature learning and prediction modeling have been investigated to tackle the problem. Though effective, most existing discriminative methods yield a single deterministic estimation of target poses. Due to their intrinsically single-valued mapping, they fail to adequately handle self-occlusion problems, where occluded joints present multiple modes. In this paper, we tackle the self-occlusion issue and provide a complete description of observed poses given an input depth image by a novel method called hierarchical mixture density networks (HMDN). The proposed method leverages state-of-the-art hand pose estimators based on Convolutional Neural Networks to facilitate feature learning, while it models the multiple modes in a two-level hierarchy to reconcile single-valued and multi-valued mapping in its output. The whole framework, with a mixture of two differentiable density functions, is naturally end-to-end trainable. In the experiments, HMDN produces interpretable and diverse candidate samples, significantly outperforms the state-of-the-art methods on two benchmarks with occlusions, and performs comparably on another benchmark free of occlusions.
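The mixture-density output that lets a network represent multiple pose modes can be sketched with the negative log-likelihood of a 1-D Gaussian mixture. The two-component parameters below stand in for a multimodal joint-depth prediction; they are illustrative assumptions, not HMDN's hierarchical architecture.

```python
import numpy as np

# Sketch of the mixture-density output used by MDN-style models: the
# negative log-likelihood of a target under a 1-D Gaussian mixture.
def gaussian_pdf(y, mu, sigma):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_nll(y, weights, mus, sigmas):
    # density p(y) = sum_k w_k N(y; mu_k, sigma_k^2), then -log p(y)
    dens = sum(w * gaussian_pdf(y, m, s)
               for w, m, s in zip(weights, mus, sigmas))
    return -np.log(dens)

# Two modes, e.g. two plausible depths of an occluded joint (toy values).
weights, mus, sigmas = [0.5, 0.5], [0.0, 3.0], [0.5, 0.5]
nll_mode = mixture_nll(0.0, weights, mus, sigmas)   # at a mode: low NLL
nll_far = mixture_nll(10.0, weights, mus, sigmas)   # far away: high NLL
```

In an MDN the weights, means, and variances are produced by the network head and this NLL is the training loss, which is what makes the whole framework end-to-end differentiable.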
Haoyuan Cai, Qi Ye, Dong-Ling Deng
Jul 19, 2021 · quant-ph
Quantum computers hold unprecedented potential for machine learning applications. Here, we prove that physical quantum circuits are PAC (probably approximately correct) learnable on a quantum computer via empirical risk minimization: to learn a parametric quantum circuit with at most $n^c$ gates and each gate acting on a constant number of qubits, the sample complexity is bounded by $\tilde{O}(n^{c+1})$. In particular, we explicitly construct a family of variational quantum circuits with $O(n^{c+1})$ elementary gates arranged in a fixed pattern, which can represent all physical quantum circuits consisting of at most $n^c$ elementary gates. Our results provide a valuable guide for quantum machine learning in both theory and practice.