François Clément, Carola Doerr, Kathrin Klamroth, Luís Paquete
The $L_{\infty}$ star discrepancy is a very well-studied measure used to quantify the uniformity of a point set distribution. Constructing optimal point sets for this measure is seen as a very hard problem in the discrepancy community. Indeed, optimal point sets are, up to now, known only for $n\leq 6$ in dimension 2 and $n \leq 2$ in higher dimensions. We introduce in this paper mathematical programming formulations to construct point sets with as low an $L_{\infty}$ star discrepancy as possible. First, we present two models to construct optimal sets and show that there always exist optimal sets in which no two points share a coordinate. Then, we provide possible extensions of our models to other measures, such as the extreme and periodic discrepancies. For the $L_{\infty}$ star discrepancy, we are able to compute optimal point sets for up to 21 points in dimension 2 and for up to 8 points in dimension 3. For $d=2$ and $n\ge 7$ points, these point sets have a discrepancy roughly 50% lower than that of the current-best point sets, and exhibit a very different structure.
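As a point of reference for the measure being optimized, the $L_{\infty}$ star discrepancy of a small two-dimensional point set can be computed exactly by enumerating the critical anchored boxes whose corners lie on the grid of point coordinates. The sketch below is an illustration of this standard enumeration, not code from the paper; the function name is ours:

```python
def star_discrepancy_2d(points):
    """Exact L-infinity star discrepancy of a 2D point set in [0,1)^2.

    The supremum over anchored boxes [0,a) x [0,b) is attained on the grid
    of point coordinates (plus 1), checking both open and closed counts.
    """
    n = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    best = 0.0
    for a in xs:
        for b in ys:
            vol = a * b
            # points strictly inside the open box, and inside the closed box
            open_cnt = sum(1 for px, py in points if px < a and py < b)
            closed_cnt = sum(1 for px, py in points if px <= a and py <= b)
            best = max(best, vol - open_cnt / n, closed_cnt / n - vol)
    return best
```

For instance, a single point at the center of the square has star discrepancy $0.75$, realized by the closed box $[0,1/2]^2$.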
Hend Ben Ameur, François Clément, Pierre Weis, Guy Chavent
The estimation of distributed parameters in partial differential equations (PDE) from measurements of the solution of the PDE may lead to under-determination problems. The choice of a parameterization is a usual way of adding a priori information by reducing the number of unknowns according to the physics of the problem. The refinement indicators algorithm provides a fruitful adaptive parameterization technique that parsimoniously opens the degrees of freedom in an iterative way. We present a new general form of the refinement indicators algorithm that is applicable to the estimation of multi-dimensional parameters in any PDE. In the linear case, we state the relationship between the refinement indicator and the decrease of the usual least-squares data misfit objective function. We give numerical results in the simple case of the identity model, and this application reveals the refinement indicators algorithm as an image segmentation technique.
Alex Albors, François Clément, Shosuke Kiami, Braeden Sodt, Ding Yifan, Tony Zeng
Given an initial point $x_0 \in \mathbb{R}^d$ and a sequence of vectors $v_1, v_2, \dots$ in $\mathbb{R}^d$, we define a greedy sequence by setting $x_{n} = x_{n-1} \pm v_n$ where the sign is chosen so as to minimize $\|x_n\|$. We prove that if the vectors $v_i$ are chosen uniformly at random from $\mathbb{S}^{d-1}$ then elements of the sequence are, on average, approximately at distance $\|x_n\| \sim \sqrt{\pi d/8}$ from the origin. We show that the sequence $(\|x_n\|)_{n=1}^{\infty}$ has an invariant measure $\pi_d$ depending only on $d$ and we determine its mean and study its decay for all $d$. We also investigate a completely deterministic example in $d=2$ where the $v_n$ are derived from the van der Corput sequence. Several additional examples are considered.
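The greedy rule above is easy to simulate. The sketch below (an illustration, not code from the paper) draws uniform directions on $\mathbb{S}^{d-1}$ as normalized Gaussian vectors and picks the sign from the inner product, since $\|x \pm v\|^2 = \|x\|^2 \pm 2\langle x, v\rangle + 1$:

```python
import math
import random

def greedy_walk(d, steps, seed=0):
    """Greedy signed sum x_n = x_{n-1} +/- v_n, with the sign chosen to
    minimize ||x_n||; the v_n are uniform on the unit sphere S^{d-1}."""
    rng = random.Random(seed)
    x = [0.0] * d
    norms = []
    for _ in range(steps):
        # uniform direction on the sphere: normalized Gaussian vector
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        s = math.sqrt(sum(c * c for c in v))
        v = [c / s for c in v]
        # ||x +/- v||^2 = ||x||^2 +/- 2<x, v> + 1, so negate the sign of <x, v>
        dot = sum(a * b for a, b in zip(x, v))
        sign = -1.0 if dot > 0 else 1.0
        x = [a + sign * b for a, b in zip(x, v)]
        norms.append(math.sqrt(sum(c * c for c in x)))
    return norms
```

In dimension $d = 2$, the long-run average of $\|x_n\|$ should be close to $\sqrt{\pi \cdot 2/8} \approx 0.886$.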
François Clément, Carola Doerr, Kathrin Klamroth, Luís Paquete
Low discrepancy point sets have been widely used as a tool to approximate continuous objects by discrete ones in numerical processes, for example in numerical integration. Following a century of research on the topic, it is still unclear how low the discrepancy of point sets can go; in other words, how regularly distributed points can be in a given space. Recent insights using optimization and machine learning techniques have led to substantial improvements in the construction of low-discrepancy point sets, resulting in configurations of much lower discrepancy values than previously known. Building on the optimal constructions, we present a simple way to obtain $L_{\infty}$-optimized placement of points that follow the same relative order as an (arbitrary) input set. Applying this approach to point sets in dimensions 2 and 3 for up to 400 and 50 points, respectively, we obtain point sets whose $L_{\infty}$ star discrepancies are up to 25% smaller than those of the current-best sets, and around 50% better than classical constructions such as the Fibonacci set.
François Clément
The design of uniformly spread sequences on $[0,1)$ has been extensively studied since the work of Weyl and van der Corput in the early $20^{\text{th}}$ century. The current best sequences are based on the Kronecker sequence with golden ratio and a permutation of the van der Corput sequence by Ostromoukhov. Despite extensive efforts, it is still unclear if it is possible to improve these constructions further. We show, using numerical experiments, that a radically different approach introduced by Kritzinger seems to perform better than the existing methods. In particular, this construction is based on a \emph{greedy} approach, and yet outperforms very delicate number-theoretic constructions. Furthermore, we are also able to provide the first numerical results in dimensions 2 and 3, and show that the sequence remains highly regular in this new setting.
François Clément, Dan Guyer
Motivated by trying to understand the behavior of the simplex method, Athanasiadis, De Loera and Zhang provided upper and lower bounds on the number of monotone paths on 3-polytopes. For simple 3-polytopes with $2n$ vertices, they showed that the number of monotone paths is bounded above by $(1+\varphi)^n$, with $\varphi$ being the golden ratio. We improve the result and show that for a larger family of graphs the number is bounded above by $c \cdot 1.6779^n$ for some universal constant $c$. Meanwhile, the best known construction and conjectured extremizer is approximately $\varphi^n$.
Sylvie Boldo, François Clément, Jean-Christophe Filliâtre, Micaela Mayero, Guillaume Melquiond, Pierre Weis
Popular finite difference numerical schemes for the resolution of the one-dimensional acoustic wave equation are well-known to be convergent. We present a comprehensive formalization of the simplest one and formally prove its convergence in Coq. The main difficulties lie in the proper definition of asymptotic behaviors and the implicit way they are handled in the mathematical pen-and-paper proofs. To our knowledge, this is the first time such a mathematical proof has been machine-checked.
François Clément
Kronecker sequences $(k \alpha \mod 1)_{k=1}^{\infty}$ for some irrational $\alpha > 0$ have played an important role in many areas of mathematics. To each finite segment $(k \alpha \mod 1)_{k=1}^{n}$ one can associate a permutation $\pi \in S_n$ via the canonical lifting to two dimensions. We show that the permutations induced by Kronecker sequences with irrational $\alpha$ are extremely regular for specific choices of $n$ and $\alpha$. In particular, for every quadratic irrational there are infinitely many choices of $n$ that lead to permutations in which no cycle has length more than 4.
François Clément, Carola Doerr, Luís Paquete
Building upon the exact methods presented in our earlier work [J. Complexity, 2022], we introduce a heuristic approach for the star discrepancy subset selection problem. The heuristic gradually improves the current-best subset by replacing one of its elements at a time. While we prove that the heuristic does not necessarily return an optimal solution, we obtain very promising results for all tested dimensions. For example, for moderate point set sizes $30 \leq n \leq 240$ in dimension 6, we obtain point sets with $L_{\infty}$ star discrepancy up to 35% better than that of the first $n$ points of the Sobol' sequence. Our heuristic works in all dimensions, the main limitation being the precision of the discrepancy calculation algorithms. We also provide a comparison with a recent energy functional introduced by Steinerberger [J. Complexity, 2019], showing that our heuristic performs better on all tested instances.
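The element-replacement heuristic can be viewed as a plain local search. The sketch below is an illustrative reimplementation in dimension 2 only, using a brute-force exact discrepancy computation; it is not the authors' optimized code, and the function names are ours:

```python
import random

def star_disc_2d(points):
    """Brute-force exact L-infinity star discrepancy of a 2D point set."""
    n = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    best = 0.0
    for a in xs:
        for b in ys:
            vol = a * b
            op = sum(1 for px, py in points if px < a and py < b)
            cl = sum(1 for px, py in points if px <= a and py <= b)
            best = max(best, vol - op / n, cl / n - vol)
    return best

def swap_heuristic(population, m, seed=0):
    """Local search for the subset selection problem: start from a random
    m-subset and replace one element at a time while it lowers the
    discrepancy; stop at a local optimum."""
    rng = random.Random(seed)
    subset = rng.sample(population, m)
    rest = [p for p in population if p not in subset]
    improved = True
    while improved:
        improved = False
        best_d = star_disc_2d(subset)
        for i in range(m):
            for j, q in enumerate(rest):
                cand = subset[:i] + [q] + subset[i + 1:]
                d = star_disc_2d(cand)
                if d < best_d - 1e-12:
                    # accept the swap: q enters the subset, subset[i] leaves
                    rest[j], subset, best_d = subset[i], cand, d
                    improved = True
    return subset, best_d
```

Each accepted swap strictly decreases the discrepancy, so the search terminates; as the paper notes, such a local optimum need not be a globally optimal subset.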
François Clément, Linhang Huang, Woorim Lee, Cole Smidt, Braeden Sodt, Xuan Zhang
The construction of low-discrepancy sets, used for uniform sampling and numerical integration, has recently seen great improvements based on optimization and machine learning techniques. However, these methods are computationally expensive, often requiring days of computation or access to GPU clusters. We show that simple gradient descent-based techniques allow for comparable results when starting with a reasonably uniform point set. Not only is this method much more efficient and accessible, but it can be applied as post-processing to any low-discrepancy set generation method for a variety of standard discrepancy measures.
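To illustrate the idea of descending on a discrepancy measure, one can run plain gradient descent on the $L_2$ star discrepancy, which has a closed form due to Warnock. The sketch below is not the paper's implementation; it uses finite-difference gradients for simplicity rather than analytic ones:

```python
from math import prod

def l2_star_disc_sq(pts):
    """Squared L2 star discrepancy via Warnock's closed-form formula."""
    n, d = len(pts), len(pts[0])
    s1 = sum(prod((1 - x * x) / 2 for x in p) for p in pts)
    s2 = sum(prod(1 - max(a, b) for a, b in zip(p, q)) for p in pts for q in pts)
    return 3.0 ** -d - 2.0 / n * s1 + s2 / n ** 2

def descend(pts, lr=0.02, steps=300, h=1e-6):
    """Plain finite-difference gradient descent on the L2 star discrepancy,
    clipping coordinates back into [0, 1] after every step."""
    pts = [list(p) for p in pts]
    for _ in range(steps):
        base = l2_star_disc_sq(pts)
        grads = []
        for p in pts:
            g = []
            for k in range(len(p)):
                old = p[k]
                p[k] = old + h          # forward difference in coordinate k
                g.append((l2_star_disc_sq(pts) - base) / h)
                p[k] = old
            grads.append(g)
        for p, g in zip(pts, grads):
            for k in range(len(p)):
                p[k] = min(1.0, max(0.0, p[k] - lr * g[k]))
    return pts
```

As a sanity check, for a single point at $1/2$ in dimension 1 the formula gives $1/3 - 1/2 + 1/4 = 1/12$, matching the direct integral.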
Sylvie Boldo, François Clément, David Hamelin, Micaela Mayero, Pierre Rousselin
The goal of this contribution is to provide worksheets in Coq for students to learn about divisibility and binomials. These basic topics are a good case study as they are widely taught in the early academic years (or before in France). We present here our technical and pedagogical choices, the numerous exercises we developed, and a small experiment we conducted with two students. As expected, it required additional Coq material such as other lemmas and dedicated tactics. The worksheets are freely available and flexible in several ways.
Deyao Chen, François Clément, Carola Doerr, Nathan Kirk
Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of kernel discrepancies, selecting an $m$-element subset from a large population of size $n \gg m$. We introduce a novel subset selection algorithm applicable to general kernel discrepancies to efficiently generate low-discrepancy samples both from the uniform distribution on the unit hypercube, the traditional setting of classical QMC, and from more general distributions $F$ with known density functions by employing the kernel Stein discrepancy. We also explore the relationship between the classical $L_2$ star discrepancy and its $L_\infty$ counterpart.
François Clément, Stefan Steinerberger
The k-means problem is perhaps the classical clustering problem and is often synonymous with Lloyd's algorithm (1957). It has become clear that Hartigan's algorithm (1975) gives better results in almost all cases; Telgarsky and Vattani note a typical improvement of $5\%$ -- $10\%$. We point out that a very minor variation of Hartigan's method leads to another $2\%$ -- $5\%$ improvement; the improvement tends to become larger when either the dimension or $k$ increases.
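Hartigan's move criterion can be stated concretely: moving a point $x$ from cluster $A$ to cluster $B$ changes the within-cluster sum of squares by $\frac{|B|}{|B|+1}\|x-\mu_B\|^2 - \frac{|A|}{|A|-1}\|x-\mu_A\|^2$, and the move is made whenever this is negative. Below is a minimal illustrative implementation of the baseline Hartigan method (not the paper's variant):

```python
import random

def hartigan_kmeans(points, k, seed=0):
    """Hartigan-style k-means: repeatedly move single points between clusters
    whenever the size-aware test shows the move lowers the within-cluster
    sum of squares."""
    rng = random.Random(seed)
    d = len(points[0])
    assign = [rng.randrange(k) for _ in points]
    size = [0] * k
    sums = [[0.0] * d for _ in range(k)]  # coordinate sums per cluster
    for p, a in zip(points, assign):
        size[a] += 1
        for t in range(d):
            sums[a][t] += p[t]

    def sq_to_center(p, c):
        return sum((p[t] - sums[c][t] / size[c]) ** 2 for t in range(d))

    changed = True
    while changed:
        changed = False
        for i, p in enumerate(points):
            a = assign[i]
            if size[a] <= 1:
                continue
            # exact objective change when moving p from cluster a to cluster b
            gain = size[a] / (size[a] - 1) * sq_to_center(p, a)
            best, best_delta = a, 0.0
            for b in range(k):
                if b == a or size[b] == 0:
                    continue
                delta = size[b] / (size[b] + 1) * sq_to_center(p, b) - gain
                if delta < best_delta - 1e-12:
                    best, best_delta = b, delta
            if best != a:
                size[a] -= 1
                size[best] += 1
                for t in range(d):
                    sums[a][t] -= p[t]
                    sums[best][t] += p[t]
                assign[i] = best
                changed = True
    return assign
```

Every accepted move strictly decreases the objective, so the loop terminates at a local optimum with respect to single-point moves.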
Sylvie Boldo, François Clément, Louise Leclerc
The Bochner integral is a generalization of the Lebesgue integral, for functions taking their values in a Banach space. Therefore, both its mathematical definition and its formalization in the Coq proof assistant are more challenging, as we cannot rely on the properties of real numbers. Our contributions include an original formalization of simple functions, Bochner integrability defined by a dependent type, and the construction of the proof of the integrability of measurable functions under mild hypotheses (weak separability). Then, we define the Bochner integral and prove several theorems, including dominated convergence and the equivalence with an existing formalization of the Lebesgue integral for nonnegative functions.
François Clément, Vincent Martin
To obtain the highest confidence in the correctness of numerical simulation programs implementing the finite element method, one has to formalize the mathematical notions and results that allow one to establish the soundness of the method. Sobolev spaces are the mathematical framework in which most weak formulations of partial differential equations are stated, and where solutions are sought. These functional spaces are built on integration and measure theory. Hence, this chapter of functional analysis is a mandatory theoretical cornerstone for the definition of the finite element method. The purpose of this document is to provide the formal proof community with very detailed pen-and-paper proofs of the main results from integration and measure theory.
François Clément, Carola Doerr, Luís Paquete
Motivated by applications in instance selection, we introduce the star discrepancy subset selection problem, which consists of finding a subset of m out of n points that minimizes the star discrepancy. First, we show that this problem is NP-hard. Then, we introduce a mixed integer linear formulation (MILP) and a combinatorial branch-and-bound (BB) algorithm for the star discrepancy subset selection problem and we evaluate both approaches against random subset selection and a greedy construction on different use-cases in dimension two and three. Our results show that the MILP and BB are efficient in dimension two for large and small $m/n$ ratio, respectively, provided $n$ is not too large. However, the performance of both approaches decays strongly for larger dimensions and set sizes. As a side effect of our empirical comparisons we obtain point sets with discrepancy values that are much smaller than those of common low-discrepancy sequences, random point sets, and Latin hypercube sampling. This suggests that subset selection could be an interesting approach for generating point sets of small discrepancy value.
Sylvie Boldo, François Clément, Jean-Christophe Filliâtre, Micaela Mayero, Guillaume Melquiond, Pierre Weis
Computer programs may go wrong due to exceptional behaviors, out-of-bound array accesses, or simply coding errors. Thus, they cannot be blindly trusted. Scientific computing programs are no exception in that respect, and even raise specific accuracy issues due to their massive use of floating-point computations. Yet, it is uncommon to guarantee their correctness. Indeed, we had to extend existing methods and tools for proving the correct behavior of programs to verify an existing numerical analysis program. This C program implements the second-order centered finite difference explicit scheme for solving the 1D wave equation. In fact, we have gone much further as we have mechanically verified the convergence of the numerical scheme in order to get a complete formal proof covering all aspects from partial differential equations to actual numerical results. To the best of our knowledge, this is the first time such a comprehensive proof is achieved.
François Clément, Vincent Martin
To obtain the highest confidence in the correctness of numerical simulation programs implementing the finite element method, one has to formalize the mathematical notions and results that allow one to establish the soundness of the method. The Lax-Milgram theorem may be seen as one of these theoretical cornerstones: under some completeness and coercivity assumptions, it states the existence and uniqueness of the solution to the weak formulation of some boundary value problems. The purpose of this document is to provide the formal proof community with a very detailed pen-and-paper proof of the Lax-Milgram theorem.
François Clément, Stefan Steinerberger
Consider an infinite sequence $(x_k)_{k=1}^{\infty}$ on the unit circle $\mathbb{S}^1$. We may interpret the first $n$ elements $(x_k)_{k=1}^{n}$ as places where the `circular stick' $\mathbb{S}^1$ is broken into a total of $n+1$ pieces. It is clear that the pieces cannot all have the same length all the time. de Bruijn and Erdős (1949) showed that the ratio of the largest to the smallest has to be arbitrarily close to 2 infinitely many times, which is sharp. They also considered the problem of balancing the lengths of $r$ consecutive intervals and proved $$ \frac{\max \text{ length of } r \text{ consecutive intervals}}{\min \text{ length of } r \text{ consecutive intervals}} \geq 1 + \frac{1}{r}.$$ We prove that this ratio can be as small as $1 + c \log{r}/r$. This is done by means of refined discrepancy estimates for the van der Corput sequence over very short intervals, and proves a conjecture of Brethouwer.
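The van der Corput sequence underlying the construction is the binary radical inverse, and the interval-length ratios are easy to inspect numerically. The sketch below (illustrative only; here the first $n$ points cut the circle into $n$ arcs) reports the ratio of the longest to the shortest run of $r$ consecutive gaps:

```python
def van_der_corput(i, base=2):
    """Radical inverse of i in the given base (van der Corput sequence)."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        x += (i % base) * f
        i //= base
        f /= base
    return x

def gap_ratio(n, r=1):
    """Break the circle [0,1) with the first n van der Corput points and
    return the ratio of the longest to the shortest sum of r consecutive gaps."""
    pts = sorted(van_der_corput(i) for i in range(n))
    # wrap-around gaps on the circle
    gaps = [(pts[(j + 1) % n] - pts[j]) % 1.0 for j in range(n)]
    runs = [sum(gaps[(j + t) % n] for t in range(r)) for j in range(n)]
    return max(runs) / min(runs)
```

For $n = 2^k$ the first $n$ points are exactly the grid $\{j/2^k\}$, so the ratio is $1$; for the other small $n$ checked here the gaps take two values differing by a factor of $2$.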
Sylvie Boldo, François Clément, Vincent Martin, Micaela Mayero
Even if binary relations and orders are a common formalization topic, we need to formalize specific orders (namely monomial and graded orders) in the process of formalizing the finite element method in Rocq. This article therefore presents definitions, operators, and proofs of properties about relations and orders, providing a comprehensive Rocq library. We especially focus on monomial orders, which are total orders compatible with the monoid operation. Beyond their definition and proved properties, we construct several concrete monomial orders, among them the lexicographic and grevlex orders. For the sake of genericity, we formalize the grading of an order, a high-level operator that transforms a binary relation into another one, and we prove that grading an order preserves many of its properties, such as the monomial order property. This leads us to the definition and properties of four different graded orders, with highly factorized proofs. The resulting library about orders, including monomial and graded orders, is comprehensive and user-friendly, and contains more than 700 lemmas.