Di Wang, Kazuro Furukawa, Masanori Satoh, Hiroshi Kaji, Hitoshi Sugimura, Yoshinori Enomoto, Fusashi Miyahara
A timing system provides high-precision signals to control a variety of hardware and software components in an accelerator complex. This control is guaranteed by synchronizing the radio frequency (RF) and trigger signals for subsystems such as klystrons, pulsed magnets, and beam monitors. The main trigger signal is distributed throughout the facility and repeated at the beam repetition rate. It is usually generated at a fixed phase of the AC power line to follow fluctuations of the electrical grid and thereby reduce unwanted variation in beam quality. To fulfill the needs of the multi-accelerator facility at KEK, a beam operation scheme called pulse-to-pulse modulation is employed in addition to the normal trigger synchronization and bucket-selection injection control, which increases the complexity of the timing system. Uncertainty in the system and a trigger-signal delivery error caused by a drastic AC power line drift have been observed. We present our efforts to establish a reliable timing system at KEK and several solutions that improve its robustness.
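As a rough illustration of this scheme (a sketch, not the KEK implementation; the function name and the zero-crossing model are assumptions), a main trigger locked to the AC line phase can be generated by firing at the first AC cycle boundary at or after each nominal repetition period, so that trigger timing inherits any grid-frequency drift:

```python
import math

def ac_synced_triggers(ac_freq_hz, rep_rate_hz, n_pulses, phase_drift=0.0):
    """Return trigger times (s) locked to AC-line cycle boundaries.

    phase_drift: constant frequency error of the grid (Hz), modeling the
    power-line drift that can shift trigger delivery.
    """
    actual_freq = ac_freq_hz + phase_drift
    cycle = 1.0 / actual_freq          # one AC cycle
    period = 1.0 / rep_rate_hz         # nominal repetition period
    triggers, t = [], 0.0
    for _ in range(n_pulses):
        # fire at the first cycle boundary at or after the nominal time t
        k = math.ceil(t * actual_freq - 1e-12)
        fire = k * cycle
        triggers.append(fire)
        t = fire + period
    return triggers
```

With a drifting grid (`phase_drift != 0`), the triggers slide with the AC phase, which is exactly the behavior that lets the beam follow grid fluctuations but can also cause delivery errors under drastic drift.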
Jinyan Su, Lijie Hu, Di Wang
In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) and provide excess population risk bounds for some special classes of functions that are faster than the previous results for general convex and strongly convex functions. In the first part of the paper, we study the case where the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $θ>1$. Specifically, we first show that under some mild assumptions on the loss functions, there is an algorithm whose output achieves an upper bound of $\tilde{O}((\frac{1}{\sqrt{n}}+\frac{\sqrt{d\log \frac{1}{δ}}}{nε})^{\frac{θ}{θ-1}})$ for $(ε, δ)$-DP when $θ\geq 2$, where $n$ is the sample size and $d$ is the dimension of the space. Then we address the inefficiency issue, improve the upper bounds by $\text{Poly}(\log n)$ factors, and extend to the case where $θ\geq \barθ>1$ for some known $\barθ$. Next we show that the excess population risk of population functions satisfying TNC with parameter $θ\geq 2$ is always lower bounded by $Ω((\frac{d}{nε})^{\frac{θ}{θ-1}})$ and $Ω((\frac{\sqrt{d\log \frac{1}{δ}}}{nε})^{\frac{θ}{θ-1}})$ for $ε$-DP and $(ε, δ)$-DP, respectively. In the second part, we focus on a special case where the population risk function is strongly convex. Unlike previous studies, here we assume the loss function is {\em non-negative} and {\em the optimal value of the population risk is sufficiently small}. With these additional assumptions, we propose a new method whose output achieves an upper bound of $O(\frac{d\log\frac{1}{δ}}{n^2ε^2}+\frac{1}{n^τ})$ for any $τ\geq 1$ in the $(ε,δ)$-DP model if the sample size $n$ is sufficiently large.
Jianghan Bao, Di Wang, Hai-Zhou Lu, Xiangang Wan
In perovskite-type compounds, the interplay of the cooperative Jahn-Teller effect, electronic correlations, and the orbital degree of freedom leads to intriguing properties. $\mathrm{NaCrF_{3}}$ is a newly synthesized Jahn-Teller-active fluoroperovskite in which the $\mathrm{CrF_{6}^{4-}}$ octahedra are considerably distorted. Based on first-principles calculations, we analyze its electronic structure and magnetic properties. Our numerical results show that the $\mathrm{Cr^{2+}}$ ions adopt the high-spin $t_{2g\uparrow}^{3}e_{g\uparrow}^{1}$ configuration with $G$-type orbital ordering. We also estimate the magnetic exchange couplings and find that the in-plane and interplanar nearest-neighbor interactions are ferromagnetic and antiferromagnetic, respectively. The ground state of this material is $A$-type antiferromagnetic, in agreement with experiments. Mean-field approximation theory yields Curie-Weiss and Néel temperatures in reasonable agreement with experiments. Our results give a complete explanation of the electronic structure and the magnetic and orbital order of this material, and help to further the understanding of Jahn-Teller-active perovskite-type fluorides.
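In the mean-field approximation, the Curie-Weiss temperature follows directly from the estimated exchange couplings; schematically (the sign convention and the set of couplings retained are assumptions here), with spin $S=2$ for high-spin $\mathrm{Cr^{2+}}$:

```latex
\theta_{\mathrm{CW}} \;=\; \frac{S(S+1)}{3k_{B}}\sum_{i} z_{i} J_{i},
```

where $z_i$ is the number of neighbors coupled by $J_i$; the Néel temperature for the $A$-type order is obtained analogously, with the couplings weighted by the signs of the ordering pattern.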
Di Wang, Wen Ye, Randall Sung, Hui Jiang, Jeremy M. G. Taylor, Lisa Ly, Kevin He
Prediction of time-to-event data often suffers from rare event rates, small sample sizes, high dimensionality and low signal-to-noise ratios. Incorporating published prediction models from large-scale studies is expected to improve the performance of prognosis prediction on internal individual-level time-to-event data. However, existing integration approaches typically assume that underlying distributions from the external and internal data sources are similar, which is often invalid. To account for challenges including heterogeneity, data sharing, and privacy constraints, we propose a discrete failure time modeling procedure, which utilizes a discrete-hazard-based Kullback-Leibler discriminatory information measure to quantify the discrepancy between the published models and the internal dataset. Simulations show the advantage of the proposed method compared with those solely based on the internal data or published models. We apply the proposed method to improve prediction performance on a kidney transplant dataset from a local hospital by integrating this small-scale dataset with published survival models obtained from the national transplant registry.
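For intuition, a discrete-hazard KL measure can be sketched as a sum of pointwise Bernoulli divergences over the discrete time bins (the aggregation and absence of any time-point weighting are simplifying assumptions, not the paper's exact definition):

```python
import math

def bernoulli_kl(p, q):
    """KL(p || q) between Bernoulli hazards at one discrete time point."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def discrete_hazard_kl(internal_hazards, external_hazards):
    """Total discrepancy between an internal dataset's discrete-time hazards
    and those implied by a published model. Summing unweighted over time
    points is a hypothetical choice; a practical measure might weight each
    point, e.g. by the size of the risk set."""
    return sum(bernoulli_kl(p, q)
               for p, q in zip(internal_hazards, external_hazards))
```

The measure is zero exactly when the published model reproduces the internal hazards, and grows with the discrepancy, which is what makes it usable for weighting external information.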
Di Wang, Bo Du, Liangpei Zhang
Convolutional neural networks have been widely applied to hyperspectral image classification. However, traditional convolutions cannot effectively extract features for objects with irregular distributions. Recent methods attempt to address this issue by performing graph convolutions on spatial topologies, but fixed graph structures and local perceptions limit their performance. To tackle these problems, in this paper, different from previous approaches, we perform superpixel generation on intermediate features during network training to adaptively produce homogeneous regions, obtain graph structures, and further generate spatial descriptors, which serve as graph nodes. Besides spatial objects, we also explore the graph relationships between channels by reasonably aggregating channels to generate spectral descriptors. The adjacency matrices in these graph convolutions are obtained by considering the relationships among all descriptors to realize global perceptions. By combining the extracted spatial and spectral graph features, we finally obtain a spectral-spatial graph reasoning network (SSGRN). The spatial and spectral parts of SSGRN are separately called the spatial and spectral graph reasoning subnetworks. Comprehensive experiments on four public datasets demonstrate the competitiveness of the proposed method compared with other state-of-the-art graph convolution-based approaches.
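The global-perception step can be sketched as follows (a minimal illustration, not the SSGRN implementation; the similarity-softmax adjacency and the single-projection update are assumed choices):

```python
import numpy as np

def global_graph_reasoning(descriptors, weight):
    """One graph-reasoning step over region/channel descriptors.

    descriptors: (N, C) array, one row per superpixel or channel descriptor.
    weight:      (C, C) projection matrix (learned in a real network).
    The adjacency is built from pairwise similarities among *all*
    descriptors, so every node aggregates from every other node,
    i.e. a global receptive field rather than a fixed local graph.
    """
    sim = descriptors @ descriptors.T                 # (N, N) similarities
    sim = sim - sim.max(axis=1, keepdims=True)        # stabilize the softmax
    adj = np.exp(sim)
    adj /= adj.sum(axis=1, keepdims=True)             # row-normalized adjacency
    out = adj @ descriptors @ weight                  # aggregate, then project
    return np.maximum(out, 0.0)                       # ReLU
```

Because the adjacency is recomputed from the current descriptors, the graph structure adapts during training instead of being fixed in advance.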
Di Wang
Flavor $SU(3)$ symmetry is a powerful tool for analyzing charmed baryon decays. In this work, we propose an approach to generate $SU(3)$ sum rules for singly and doubly charmed baryon decays without writing the Wigner-Eckart invariants explicitly. The $SU(3)$ sum rules are computed routinely from several master formulas. Hundreds of $SU(3)$ sum rules are found, serving as tests of flavor symmetry in charmed baryon decays.
Warren Siegel, Di Wang
F-theory is the theory proposed to incorporate superstring theory in a way such that STU dualities are manifest. A useful description uses a current superalgebra on a higher-dimensional worldvolume, following from an action for a self-dual gauge field. Here the group "metric" appearing in the Schwinger (central charge) term of this current superalgebra is generalized to a tensor, in analogy to the usual generalization of the structure constants to the torsion (and curvature). This allows the introduction of a massless background describing F-supergravity on the original bosonic worldvolume. The isotropy group is represented on superspace, while the (exceptional) symmetry is represented on the worldvolume. As an example, we solve off shell the linearized superspace constraints of the massless sector of the F-theory that generalizes the N=2 supergravity (+ matter) of 3D S(tring)-theory, the corresponding manifestly T-dual theory of T-theory, and the N=1 supergravity of 4D M-theory. The results for the prepotential, its gauge transformation, and the action agree with those that were derived previously without reference to the current algebra of the full F-theory.
Di Wang
Isospin symmetry is the most precise flavor symmetry. In this work, we propose an approach to generate isospin sum rules for heavy hadron decays without the Wigner-Eckart invariants. The effective Hamiltonian of heavy quark weak decay is fully invariant under a series of isospin lowering operators $I_-^n$, and the isospin sum rules can then be generated through several master formulas. This provides a systematic way to study the isospin symmetry of $c$- and $b$-hadron weak decays. The theoretical framework of this approach is presented in detail, with the nonleptonic decays of $D$ and $B$ mesons as examples. In addition, the $V$-/$U$-spin sum rules are derived by a similar algorithm, replacing $I_-^n$ with $V_-^n$/$U_-^n$.
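Schematically (normalizations and state labels here are illustrative, not the paper's master formulas), invariance of the effective Hamiltonian under the lowering operator turns matrix elements of $I_-$ into linear relations among decay amplitudes:

```latex
0 \;=\; \langle f\,|\, [\, I_- ,\ \mathcal{H}_{\mathrm{eff}} \,] \,|\, i \,\rangle
\quad\Longrightarrow\quad
\sum_{f'} c_{f'}\,\langle f' | \mathcal{H}_{\mathrm{eff}} | i \rangle
\;=\; \sum_{i'} c_{i'}\,\langle f | \mathcal{H}_{\mathrm{eff}} | i' \rangle ,
```

where $I_-|i\rangle=\sum_{i'} c_{i'}|i'\rangle$ and $\langle f|I_- = \sum_{f'} c_{f'}\langle f'|$ expand the lowered states in the isospin multiplets. Iterating with higher powers $I_-^n$ then yields the full set of sum rules without ever writing the Wigner-Eckart invariants.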
Di Wang, Nicolas Honnorat, Peter T. Fox, Kerstin Ritter, Simon B. Eickhoff, Sudha Seshadri, Mohamad Habes
Deep neural networks currently provide the most advanced and accurate machine learning models to distinguish between structural MRI scans of subjects with Alzheimer's disease and healthy controls. Unfortunately, the subtle brain alterations captured by these models are difficult to interpret because of the complexity of these multi-layer and non-linear models. Several heatmap methods have been proposed to address this issue and analyze the imaging patterns extracted from deep neural networks, but no quantitative comparison between these methods has been carried out so far. In this work, we address this gap by deriving heatmaps from Convolutional Neural Networks (CNN) trained using T1 MRI scans of the ADNI data set, and by comparing these heatmaps with brain maps corresponding to Support Vector Machine (SVM) coefficients. Three prominent heatmap methods are studied: Layer-wise Relevance Propagation (LRP), Integrated Gradients (IG), and Guided Grad-CAM (GGC). Contrary to prior studies where the quality of heatmaps was visually or qualitatively assessed, we obtained precise quantitative measures by computing overlap with a ground-truth map from a large meta-analysis that combined 77 voxel-based morphometry (VBM) studies independently from ADNI. Our results indicate that all three heatmap methods were able to capture brain regions covering the meta-analysis map and achieved better results than SVM coefficients. Among them, IG produced the heatmaps with the best overlap with the independent meta-analysis.
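Of the three methods, Integrated Gradients is easy to state outside any deep-learning framework; a minimal sketch (midpoint Riemann sum; the zero baseline is a common but assumed choice):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline=None, steps=50):
    """Integrated Gradients attribution for a differentiable model.

    grad_fn:  maps an input array to the gradient of the model output
              w.r.t. that input (obtained by autodiff in practice).
    baseline: reference input x'; a zero input is used if none is given.
    Approximates (x - x') * integral_0^1 grad(x' + a*(x - x')) da
    with a midpoint Riemann sum along the straight path from x' to x.
    """
    if baseline is None:
        baseline = np.zeros_like(x)
    diff = x - baseline
    total = np.zeros_like(x)
    for alpha in (np.arange(steps) + 0.5) / steps:   # midpoints in (0, 1)
        total += grad_fn(baseline + alpha * diff)
    return diff * total / steps
```

A useful sanity check is the completeness axiom: the attributions sum to the difference in model output between the input and the baseline.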
Hanpu Shen, Cheng-Long Wang, Zihang Xiang, Yiming Ying, Di Wang
This paper focuses on the problem of Differentially Private Stochastic Optimization for (multi-layer) fully connected neural networks with a single output node. In the first part, we examine cases with no hidden nodes, specifically focusing on Generalized Linear Models (GLMs). We investigate the well-specified model where the random noise possesses a zero mean, and the link function is both bounded and Lipschitz continuous. We propose several algorithms and our analysis demonstrates the feasibility of achieving an excess population risk that remains invariant to the data dimension. We also delve into the scenario involving the ReLU link function, and our findings mirror those of the bounded link function case. We conclude this section by contrasting well-specified and misspecified models, using ReLU regression as a representative example. In the second part of the paper, we extend our ideas to two-layer neural networks with sigmoid or ReLU activation functions in the well-specified model. In the third part, we study the theoretical guarantees of DP-SGD in Abadi et al. (2016) for fully connected multi-layer neural networks. By utilizing recent advances in Neural Tangent Kernel theory, we provide the first excess population risk bound when both the sample size and the width of the network are sufficiently large. Additionally, we discuss the role of some parameters in DP-SGD regarding their utility, both theoretically and empirically.
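For reference, one DP-SGD update in the style of Abadi et al. (2016) consists of per-example gradient clipping, averaging, and Gaussian noise injection; a minimal sketch (hyperparameters are placeholders, not the paper's settings):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_mult, rng):
    """One DP-SGD update: clip each per-example gradient to L2 norm
    clip_norm, average the clipped gradients, add Gaussian noise with
    std noise_mult * clip_norm / batch_size, and take a gradient step."""
    batch = len(per_example_grads)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    noisy = (np.mean(clipped, axis=0)
             + rng.normal(0.0, noise_mult * clip_norm / batch,
                          size=params.shape))
    return params - lr * noisy
```

The clipping norm and noise multiplier are exactly the kind of parameters whose effect on utility the paper analyzes.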
Wentao Jiang, Jing Zhang, Di Wang, Qiming Zhang, Zengmao Wang, Bo Du
Due to spatial redundancy in remote sensing images, sparse tokens containing rich information are usually used in self-attention (SA) to reduce the overall number of tokens in the computation, avoiding the high computational cost of Vision Transformers. However, such methods usually obtain sparse tokens through hand-crafted or parallel-unfriendly designs, posing a challenge to reaching a better balance between efficiency and performance. Different from them, this paper proposes to use learnable meta tokens to formulate sparse tokens, which effectively learn key information while improving the inference speed. Technically, the meta tokens are first initialized from image tokens via cross-attention. Then, we propose Dual Cross-Attention (DCA) to promote information exchange between image tokens and meta tokens, where they serve as query and key (value) tokens alternately in a dual-branch structure, significantly reducing the computational complexity compared to self-attention. By employing DCA in the early stages with dense visual tokens, we obtain the hierarchical architecture LeMeViT with various sizes. Experimental results on classification and dense prediction tasks show that LeMeViT has a significant $1.7 \times$ speedup, fewer parameters, and competitive performance compared to the baseline models, and achieves a better trade-off between efficiency and performance.
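The complexity argument behind DCA can be made concrete with a projection-free, single-head sketch (a schematic illustration; the real block uses learned Q/K/V projections and multi-head attention):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(q_tokens, kv_tokens):
    """Single-head cross-attention without learned projections."""
    d = q_tokens.shape[-1]
    attn = softmax(q_tokens @ kv_tokens.T / np.sqrt(d))
    return attn @ kv_tokens

def dual_cross_attention(image_tokens, meta_tokens):
    """One DCA exchange: meta tokens attend to image tokens, then image
    tokens attend to the updated meta tokens. With N image tokens and
    M << N meta tokens, each attention map is N x M, so the cost is
    O(N*M) instead of self-attention's O(N^2)."""
    meta = cross_attention(meta_tokens, image_tokens)
    image = cross_attention(image_tokens, meta)
    return image, meta
```

Because both branches are plain dense matrix products, the scheme stays parallel-friendly, unlike gather-based sparse-token designs.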
Di Wang, Jinhui Xu
In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
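The generic sketch-and-precondition idea underlying such methods can be illustrated in a few lines (a sketch of the general technique, not the paper's exact two-step scheme; the Gaussian sketch and sizes are assumed choices):

```python
import numpy as np

def sketched_preconditioner(A, sketch_rows, rng):
    """Build a right preconditioner for least squares via a random sketch.

    QR-factorize S @ A for a small random sketch S, so that A @ inv(R)
    is well conditioned; iterative solvers such as (mini-batch) SGD then
    converge quickly on the preconditioned problem, and the solution is
    recovered as x = R_inv @ y.
    """
    n, d = A.shape
    S = rng.normal(size=(sketch_rows, n)) / np.sqrt(sketch_rows)
    _, R = np.linalg.qr(S @ A, mode="reduced")
    return np.linalg.inv(R)
```

The sketch costs only a small multiple of the data size, yet it can reduce the condition number of an ill-conditioned design matrix by orders of magnitude.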
Neil Zhenqiang Gong, Di Wang
Recently, authenticating users with the help of their friends (i.e., trustee-based social authentication) has been shown to be a promising backup authentication mechanism. A user in this system is associated with a few trustees selected from the user's friends. When the user wants to regain access to the account, the service provider sends different verification codes to the user's trustees. The user must obtain at least k (i.e., the recovery threshold) verification codes from the trustees before being directed to reset his or her password. In this paper, we provide the first systematic study of the security of trustee-based social authentications. Specifically, we first introduce a novel framework of attacks, which we call forest fire attacks. In these attacks, an attacker initially obtains a small number of compromised users, and then iteratively attacks the rest of the users by exploiting trustee-based social authentications. Then, we construct a probabilistic model to formalize the threats of forest fire attacks and their costs for attackers. Moreover, we introduce various defense strategies. Finally, we apply our framework to extensively evaluate various concrete attack and defense strategies using three real-world social network datasets. Our results have strong implications for the design of more secure trustee-based social authentications.
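The iterative spread can be sketched with a deterministic toy simulation (an illustration only; the paper's probabilistic model additionally accounts for per-attempt success probabilities and attacker costs):

```python
def forest_fire_attack(trustees, seeds, threshold, rounds):
    """Simulate the iterative spread of a forest-fire attack.

    trustees:  dict mapping each user to the list of that user's trustees.
    seeds:     set of initially compromised users.
    threshold: recovery threshold k; here a user is assumed to fall once
               at least k of his or her trustees are compromised.
    """
    compromised = set(seeds)
    for _ in range(rounds):
        newly = {u for u, ts in trustees.items()
                 if u not in compromised
                 and sum(t in compromised for t in ts) >= threshold}
        if not newly:
            break
        compromised |= newly
    return compromised
```

Even a tiny seed set can cascade through the trustee graph, which is the core threat the framework formalizes.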
Di Wang, Satish Rao, Michael W. Mahoney
The linear coupling method was introduced recently by Allen-Zhu and Orecchia for solving convex optimization problems with first order methods, and it provides a conceptually simple way to integrate a gradient descent step and a mirror descent step in each iteration. The high-level approach of the linear coupling method is very flexible, and it has shown initial promise by providing improved algorithms for packing and covering linear programs. Somewhat surprisingly, however, while the dependence of the convergence rate on the error parameter $ε$ for packing problems was improved to $O(1/ε)$, which corresponds to what accelerated gradient methods are designed to achieve, the dependence for covering problems was only improved to $O(1/ε^{1.5})$, and even that required a different, more complicated algorithm. Given the close connections between packing and covering problems, and since previous algorithms for these very related problems have led to the same $ε$ dependence, this discrepancy is surprising, and it leaves open the question of the exact role that linear coupling plays in coordinating the complementary gradient and mirror descent steps of the algorithm. In this paper, we clarify these issues for linear coupling algorithms for packing and covering linear programs, illustrating that the linear coupling method can lead to improved $O(1/ε)$ dependence for both packing and covering problems in a unified manner, i.e., with the same algorithm and almost identical analysis. Our main technical result is a novel diameter reduction method for covering problems that is of independent interest and that may be useful in applying the accelerated linear coupling method to other combinatorial problems.
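In the smooth Euclidean setting, the coupling of the two steps can be sketched as follows (an illustrative variant with textbook step sizes, not the packing/covering algorithm of the paper, which works with a non-Euclidean mirror map and width-dependent parameters):

```python
import numpy as np

def linear_coupling(grad, x0, L, iters):
    """Linear coupling of gradient and mirror descent (Euclidean sketch).

    At each iteration the query point x linearly couples the
    gradient-descent iterate y and the mirror-descent iterate z; with
    the growing mirror step size alpha and shrinking coupling weight tau
    below, this recovers an accelerated-style scheme for L-smooth convex
    functions.
    """
    y = z = np.asarray(x0, dtype=float)
    for k in range(1, iters + 1):
        alpha = k / (2.0 * L)              # growing mirror step size
        tau = 2.0 / (k + 2)                # coupling weight
        x = tau * z + (1 - tau) * y
        g = grad(x)
        y = x - g / L                      # gradient (primal) step
        z = z - alpha * g                  # mirror (dual) step, Euclidean
    return y
```

The gradient step guarantees per-iteration progress while the mirror step accumulates regret; coupling them is what produces the accelerated $O(1/ε)$-type behavior.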
Fengxiang Wang, Hongzhen Wang, Mingshuo Chen, Di Wang, Yulin Wang, Zonghao Guo, Qiang Ma, Long Lan, Wenjing Yang, Jing Zhang, Zhiyuan Liu, Maosong Sun
The astonishing breakthrough of multimodal large language models (MLLMs) has necessitated new benchmarks to quantitatively assess their capabilities, reveal their limitations, and indicate future research directions. However, this is challenging in the context of remote sensing (RS), since the imagery features ultra-high resolution that incorporates extremely complex semantic relationships. Existing benchmarks usually adopt notably smaller image sizes than real-world RS scenarios, suffer from limited annotation quality, and consider insufficient dimensions of evaluation. To address these issues, we present XLRS-Bench: a comprehensive benchmark for evaluating the perception and reasoning capabilities of MLLMs in ultra-high-resolution RS scenarios. XLRS-Bench boasts the largest average image size (8500$\times$8500) observed thus far, with all evaluation samples meticulously annotated manually, assisted by a novel semi-automatic captioner on ultra-high-resolution RS images. On top of the XLRS-Bench, 16 sub-tasks are defined to evaluate MLLMs' 10 kinds of perceptual capabilities and 6 kinds of reasoning capabilities, with a primary emphasis on advanced cognitive processes that facilitate real-world decision-making and the capture of spatiotemporal changes. The results of both general and RS-focused MLLMs on XLRS-Bench indicate that further efforts are needed for real-world RS applications. We have open-sourced XLRS-Bench to support further research in developing more powerful MLLMs for remote sensing.
Di Wang
Implementing LLM-integrated scripts introduces challenges in modularity and performance, as scripts are often coupled to specific LLM implementations and fail to exploit parallelization opportunities. This paper proposes using composable effect handling to separate workflow logic from effectful operations, such as LLM calls, I/O, and concurrency, enabling modularity without sacrificing the opportunity for performance optimization. By treating these operations as abstract interfaces and discharging them via effect handlers, this paper shows that scripts can achieve significant speedups (e.g., 10$\times$ in a Tree-of-Thoughts case study) without compromising modularity. This paper aims to promote composable effect handling as a programming style for LLM scripting.
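A minimal sketch of the style (illustrative only; the effect class, handler, and workflow names are invented here, and a real handler would dispatch to an actual LLM and could batch or parallelize independent calls):

```python
class LLMCall:
    """Abstract effect: 'ask the LLM this prompt' (no provider specified)."""
    def __init__(self, prompt):
        self.prompt = prompt

def run(workflow, handler):
    """Drive a generator-based script, discharging each yielded effect
    through the supplied handler. The workflow never touches an LLM API
    directly, so swapping handlers (mock, cached, concurrent) requires
    no change to the workflow logic."""
    gen = workflow()
    result = None
    try:
        while True:
            effect = gen.send(result)
            result = handler(effect)
    except StopIteration as stop:
        return stop.value

def mock_handler(effect):
    # Stand-in interpretation; a real handler would call an LLM here.
    return f"answer to: {effect.prompt}"

def workflow():
    a = yield LLMCall("step 1")
    b = yield LLMCall(f"step 2 given {a}")
    return b
```

Because the workflow only yields abstract effects, a concurrency-aware handler can detect independent calls and issue them in parallel, which is where the reported speedups come from.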
Wei-Chen Fu, Si-Jia Wen, Di Wang
Isospin symmetry, as the most precise flavor symmetry, can be used to extract information about hadronic dynamics. The effective Hamiltonian operators of bottom quark weak decays are zero under a series of isospin lowering operators $I_-^n$, which permits us to generate isospin sum rules without the Wigner-Eckart invariants. In this work, we derive hundreds of isospin sum rules for the two- and three-body non-leptonic decays of bottom baryons. They provide hints for new decay modes and the isospin partners of pentaquark states.
Di Wang, Yongjin Li
A criterion for a point of the unit ball to be a strongly exposed point is given. Necessary and sufficient conditions for Orlicz-Lorentz spaces to possess the strongly exposed property are established. In addition, some useful methods are obtained for handling issues related to the decreasing rearrangement.
Dongchen Si, Di Wang, Erzhong Gao, Xiaolei Qin, Liu Zhao, Jing Zhang, Minqiang Xu, Jianbo Zhan, Jianshe Wang, Lin Liu, Bo Du, Liangpei Zhang
Spectral information has long been recognized as a critical cue in remote sensing observations. Although numerous vision-language models have been developed for pixel-level interpretation, spectral information remains underutilized, resulting in suboptimal performance, particularly in multispectral scenarios. To address this limitation, we construct a vision-language instruction-following dataset named SPIE, which encodes spectral priors of land-cover objects into textual attributes recognizable by large language models (LLMs), based on classical spectral index computations. Leveraging this dataset, we propose SPEX, a multimodal LLM designed for instruction-driven land cover extraction. To this end, we introduce several carefully designed components and training strategies, including multiscale feature aggregation, token context condensation, and multispectral visual pre-training, to achieve precise and flexible pixel-level interpretation. To the best of our knowledge, SPEX is the first multimodal vision-language model dedicated to land cover extraction in spectral remote sensing imagery. Extensive experiments on five public multispectral datasets demonstrate that SPEX consistently outperforms existing state-of-the-art methods in extracting typical land cover categories such as vegetation, buildings, and water bodies. Moreover, SPEX is capable of generating textual explanations for its predictions, thereby enhancing interpretability and user-friendliness. Code will be released at: https://github.com/MiliLab/SPEX.
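A classical spectral index computation of the kind SPIE builds on can be sketched as follows (the NDVI formula is standard; the 0.3 cutoff and the phrasing of the textual attribute are illustrative assumptions, not the dataset's actual encoding):

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from band reflectances."""
    return (nir - red) / (nir + red + 1e-9)

def spectral_attribute(nir, red, threshold=0.3):
    """Encode a spectral prior as a textual attribute an LLM can read.
    High NDVI indicates healthy vegetation; the threshold is a common
    rule-of-thumb choice."""
    v = ndvi(nir, red)
    label = "likely vegetated" if v > threshold else "likely non-vegetated"
    return f"NDVI={v:.2f} ({label})"
```

Rendering such per-object indices as text is what lets a language model exploit spectral priors without any change to its architecture.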
Haoyang Chen, Jing Zhang, Hebaixu Wang, Shiqin Wang, Pohsun Huang, Jiayuan Li, Haonan Guo, Di Wang, Zheng Wang, Bo Du
Multi-modal remote sensing imagery provides complementary observations of the same geographic scene, yet such observations are frequently incomplete in practice. Existing cross-modal translation methods treat each modality pair as an independent task, resulting in quadratic complexity and limited generalization to unseen modality combinations. We formulate Any-to-Any translation as inference over a shared latent representation of the scene, where different modalities correspond to partial observations of the same underlying semantics. Based on this formulation, we propose Any2Any, a unified latent diffusion framework that projects heterogeneous inputs into a geometrically aligned latent space. Such structure performs anchored latent regression with a shared backbone, decoupling modality-specific representation learning from semantic mapping. Moreover, lightweight target-specific residual adapters are used to correct systematic latent mismatches without increasing inference complexity. To support learning under sparse but connected supervision, we introduce RST-1M, the first million-scale remote sensing dataset with paired observations across five sensing modalities, providing supervision anchors for any-to-any translation. Experiments across 14 translation tasks show that Any2Any consistently outperforms pairwise translation methods and exhibits strong zero-shot generalization to unseen modality pairs. Code and models will be available at https://github.com/MiliLab/Any2Any.