Zhe Chang, Pranati K. Rath, Yu Sang, Dong Zhao, Yong Zhou
Jun 29, 2018 · astro-ph.CO
We propose a framework of Finsler space-time to explain the observed parity asymmetry and the power deficit in the low-$\ell$ ($2\leqslant \ell \leqslant 29$) multipole range of cosmic microwave background (CMB) temperature anisotropies. In the $3+1$ dimensional space-time, the three-dimensional space is described by a Randers-Finsler space, which is spatially irreversible, inducing the parity asymmetry in the CMB angular power spectrum. We estimate the constraints on the two parameters introduced by Finsler space-time via analyzing the low-$\ell$ angular power spectrum in PLANCK 2015 CMB temperature data. We see that the low-$\ell$ power suppression in the CMB temperature anisotropies can also be resolved in this scenario. Our study shows that the two low-$\ell$ anomalies, i.e., parity asymmetry and power deficit, may have a common origin.
Shuang Wang, Dong Zhao, Yi Li, Chi Zhang, Yuwei Guo, Qi Zang, Biao Hou, Licheng Jiao
Feature alignment between domains is one of the mainstream methods for Unsupervised Domain Adaptation (UDA) semantic segmentation. Existing feature alignment methods for semantic segmentation learn domain-invariant features by adversarial training to reduce domain discrepancy, but they have two limitations: 1) associations among pixels are not maintained, and 2) the classifier trained on the source domain cannot adapt well to the target domain. In this paper, we propose a new UDA semantic segmentation approach based on the domain closeness assumption to alleviate these problems. Specifically, a prototype clustering strategy is applied to cluster pixels with the same semantics, which better maintains associations among target domain pixels during feature alignment. After clustering, to make the classifier more adaptive, a normalized cut loss based on the affinity graph of the target domain is utilized, which makes the decision boundary target-specific. Extensive experiments on GTA5 $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes demonstrate the effectiveness of our method, which achieves new state-of-the-art results.
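To make the normalized cut loss mentioned above concrete, here is a minimal sketch of the commonly used soft relaxation, $\sum_k p_k^\top W (1 - p_k) / (d^\top p_k)$ with $d = W\mathbf{1}$. This is an assumption about the general form, not the paper's exact loss; the function name `normalized_cut_loss` and the toy affinity matrix are hypothetical.

```python
def normalized_cut_loss(W, P):
    """Soft normalized cut over K classes.

    W: n x n symmetric affinity matrix (list of lists).
    P: n x K soft class assignments, each row summing to 1.
    Illustrative sketch only; not the paper's implementation.
    """
    n, num_classes = len(W), len(P[0])
    degree = [sum(row) for row in W]  # d = W 1
    loss = 0.0
    for k in range(num_classes):
        p = [P[i][k] for i in range(n)]
        # Affinity mass leaving class k: p^T W (1 - p)
        cut = sum(p[i] * W[i][j] * (1.0 - p[j]) for i in range(n) for j in range(n))
        # Total affinity mass attached to class k: d^T p
        assoc = sum(degree[i] * p[i] for i in range(n))
        if assoc > 0:
            loss += cut / assoc
    return loss


# Toy affinity graph (assumed values): two disconnected pairs {0, 1} and {2, 3}.
W = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
aligned = [[1, 0], [1, 0], [0, 1], [0, 1]]    # assignment that follows the graph
uniform = [[0.5, 0.5]] * 4                    # uninformative assignment
print(normalized_cut_loss(W, aligned), normalized_cut_loss(W, uniform))
```

An assignment that respects the affinity graph gives a lower loss than an uninformative one, which is the sense in which such a loss pulls the decision boundary toward the structure of the target data.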
Dong Zhao, Adriana-Simona Mihaita, Yuming Ou, Sajjad Shafiei, Hanna Grzybowska, A. K. Qin, Gary Tan, Mo Li, Hussein Dia
A multi-modal transport system is acknowledged to have robust failure tolerance and can effectively relieve urban congestion issues. However, estimating the impact of disruptions across multiple transport modes is a challenging problem because disaggregated modelling approaches are applied to only individual modes at a time. To fill this gap, this paper proposes a new integrated modelling framework for multi-modal traffic state estimation and evaluation of the disruption impact across all modes under various traffic conditions. First, we propose an iterative trip assignment model to elucidate the association between travel demand and travel behaviour, including a multi-modal origin-to-destination estimation for private and public transport. Second, we provide a practical multi-modal travel demand re-adjustment that takes the mode shift of the affected travellers into consideration. The pros and cons of the mode shift strategy are showcased via several scenario-based transport simulation experiments. The results show that a well-balanced mode shift with flexible routing and early announcements of detours, so that travellers can plan ahead, can significantly benefit all travellers by reducing delay time by 46%, while a stable route assignment maintains a higher average traffic flow and the inactive mode-route choice helps relieve density under traffic disruptions.
Dong Zhao, Shuang Wang, Qi Zang, Licheng Jiao, Nicu Sebe, Zhun Zhong
We study source-free unsupervised domain adaptation (SFUDA) for semantic segmentation, which aims to adapt a source-trained model to the target domain without accessing the source data. Many works have been proposed to address this challenging problem, among which uncertainty-based self-training is a predominant approach. However, without comprehensive denoising mechanisms, they still largely fall into biased estimates when dealing with different domains and confirmation bias. In this paper, we observe that pseudo-label noise is mainly contained in unstable samples in which the predictions of most pixels undergo significant variations during self-training. Inspired by this, we propose a novel mechanism to denoise unstable samples with stable ones. Specifically, we introduce the Stable Neighbor Denoising (SND) approach, which effectively discovers highly correlated stable and unstable samples by nearest neighbor retrieval and guides the reliable optimization of unstable samples by bi-level learning. Moreover, we compensate for the stable set by object-level object paste, which can further eliminate the bias caused by less learned classes. Our SND enjoys two advantages. First, SND does not require a specific segmentor structure, endowing its universality. Second, SND simultaneously addresses the issues of class, domain, and confirmation biases during adaptation, ensuring its effectiveness. Extensive experiments show that SND consistently outperforms state-of-the-art methods in various SFUDA semantic segmentation settings. In addition, SND can be easily integrated with other approaches, obtaining further improvements.
Dong Zhao, Qi Zang, Nan Pu, Wenjing Li, Nicu Sebe, Zhun Zhong
Domain Generalization in Semantic Segmentation (DG-SS) aims to enable segmentation models to perform robustly in unseen environments. However, conventional DG-SS methods are restricted to a fixed set of known categories, limiting their applicability in open-world scenarios. Recent progress in Vision-Language Models (VLMs) has advanced Open-Vocabulary Semantic Segmentation (OV-SS) by enabling models to recognize a broader range of concepts. Yet, these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments, a challenge that is particularly severe in urban-driving scenarios. To bridge this gap, we introduce Open-Vocabulary Domain Generalization in Semantic Segmentation (OVDG-SS), a new setting that jointly addresses unseen domains and unseen categories. We introduce the first benchmark for OVDG-SS in autonomous driving, addressing a previously unexplored problem and covering both synthetic-to-real and real-to-real generalization across diverse unseen domains and unseen categories. In OVDG-SS, we observe that domain shifts often distort text-image correlations in pre-trained VLMs, which hinders the performance of OV-SS models. To tackle this challenge, we propose S2-Corr, a state-space-driven text-image correlation refinement mechanism that mitigates domain-induced distortions and produces more consistent text-image correlations under distribution changes. Extensive experiments on our constructed benchmark demonstrate that the proposed method achieves superior cross-domain performance and efficiency compared to existing OV-SS approaches.
Dong Zhao, Jun-Qing Xia
We test the anisotropy in the Finslerian cosmological model with the X-ray and ultraviolet (UV) fluxes of 808 quasars. The dipole amplitude is $A_D=0.302_{-0.124}^{+0.185}$ and the dipole direction points towards $(l, b) = ({288.92^{\circ}}_{-28.80^{\circ}}^{+23.74^{\circ}}, {6.10^{\circ}}_{-16.40^{\circ}}^{+16.55^{\circ}})$. We find that the dipole direction from the X-ray and UV fluxes of quasars is very close to the dipole direction given by the "Joint Light-curve Analysis" (JLA) compilation in the Finslerian cosmological model and the angular difference between the two dipole directions is only $10.44^{\circ}$. We also find the angular difference between the dipole direction from the 808 quasars in the Finslerian cosmological model and ones from the supernovae of type Ia (SNe Ia) samples in the dipole-modulated $\Lambda$CDM model is around $30^{\circ}$. Six gravitationally lensed quasars are considered to investigate the Hubble constant $H_0$ in the Finslerian cosmological model. We get a slightly smaller $H_0$ than the result given by the six gravitationally lensed quasars. Finally, we forecast the future constraints on the dipole parameters with the X-ray and UV fluxes of quasars. As the number of simulations increases, the precisions of the parameters related to anisotropy in the Finslerian cosmological model improve significantly. The X-ray and UV fluxes of quasars have a promising future as a probe of anisotropy in Finsler spacetime.
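The angular difference between two dipole directions given in galactic coordinates $(l, b)$ can be computed with the spherical law of cosines. The sketch below shows that calculation; the function name is hypothetical, and the second direction in the usage line is an arbitrary placeholder, not the JLA direction from the paper.

```python
import math


def angular_separation_deg(l1, b1, l2, b2):
    """Great-circle angle in degrees between two directions (l, b) given in degrees."""
    l1, b1, l2, b2 = (math.radians(v) for v in (l1, b1, l2, b2))
    cos_sep = (math.sin(b1) * math.sin(b2)
               + math.cos(b1) * math.cos(b2) * math.cos(l1 - l2))
    # Clamp against rounding so acos never sees a value outside [-1, 1].
    cos_sep = max(-1.0, min(1.0, cos_sep))
    return math.degrees(math.acos(cos_sep))


# Quasar dipole direction from the abstract versus a placeholder direction (assumed).
print(angular_separation_deg(288.92, 6.10, 300.0, 0.0))
```

The arccosine form loses precision for very small separations; for directions a few degrees apart or more, as here, it is adequate.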
Dong Zhao, Yang Shi, Steven X. Ding, Yueyang Li, Fangzhou Fu
The replay attack detection problem is studied from a new perspective based on the parity space method in this paper. The proposed detection methods can distinguish system faults from replay attacks, handle both input and output data replay, maintain certain control performance, and be implemented conveniently and efficiently. First, the replay attack effect on the residual is derived and analyzed. The residual change induced by a replay attack is characterized explicitly, and the detection performance analysis based on two different test statistics is given. Second, based on the replay attack effect characterization, targeted passive and active designs for detection performance enhancement are proposed. Regarding the passive design, four optimization schemes with different cost functions are proposed with optimal parity matrix solutions, and the unified solution to the passive optimization schemes is obtained; the active design is enabled by a marginally stable filter so as to enlarge the replay attack effect on the residual for detection. Simulations and comparison studies are given to show the effectiveness of the proposed methods.
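As a rough illustration of how a parity-space residual reacts to replayed outputs, the sketch below builds a residual for a two-state, single-input, single-output system $x(k+1) = Ax(k) + Bu(k)$, $y(k) = Cx(k)$. It uses a textbook parity vector from the Cayley-Hamilton theorem rather than the optimized parity matrices of the paper, and the system matrices, inputs, and function names are all assumed for illustration.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate(A, B, C, x0, inputs):
    """Return outputs y(0..N-1) of x(k+1) = A x(k) + B u(k), y(k) = C x(k)."""
    x, ys = list(x0), []
    for u in inputs:
        ys.append(sum(c * xi for c, xi in zip(C, x)))
        x = [axi + bi * u for axi, bi in zip(mat_vec(A, x), B)]
    return ys


def parity_vector_2x2(A):
    """Cayley-Hamilton: A^2 - tr(A) A + det(A) I = 0, so v = [det, -tr, 1]
    satisfies v0*C + v1*C*A + v2*C*A^2 = 0 for any output row C."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [det, -tr, 1.0]


def markov(A, B, C, i):
    """Markov parameter C A^i B."""
    v = list(B)
    for _ in range(i):
        v = mat_vec(A, v)
    return sum(c * x for c, x in zip(C, v))


def residual(A, B, C, v, y_win, u_win):
    """r = v . (Y - H_u U) over a window of len(v) consecutive samples."""
    r = 0.0
    for i in range(len(v)):
        # Input-driven part of the i-th output in the window.
        forced = sum(markov(A, B, C, i - 1 - j) * u_win[j] for j in range(i))
        r += v[i] * (y_win[i] - forced)
    return r


# Illustrative system and input sequence (assumed, not from the paper).
A = [[0.9, 0.1], [0.0, 0.8]]
B = [0.0, 1.0]
C = [1.0, 0.0]
u = [1.0, 0.0, -1.0, 2.0, 0.5, -2.0, 1.0, 3.0]
y = simulate(A, B, C, [1.0, -0.5], u)
v = parity_vector_2x2(A)

r_clean = residual(A, B, C, v, y[5:8], u[5:8])   # genuine outputs
r_replay = residual(A, B, C, v, y[0:3], u[5:8])  # outputs replayed from k = 0..2
print(r_clean, r_replay)
```

With genuine data the residual vanishes up to rounding, while replayed outputs leave a nonzero residual only because the current inputs differ from the recorded ones. If the inputs were identical in both windows, this plain residual would not change, which is the kind of case where enlarging the replay effect by design becomes relevant.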
Dong Zhao
A unified solution to adaptive approximation-based control for nonlinear systems with accurate and inaccurate state measurement is synthesized in this study. Starting from the standard adaptive approximation-based controller with accurate state measurement, its corresponding physical interpretation, stability conclusion, and learning ability are rigorously addressed in the presence of additive measurement inaccuracy, and explicit answers are obtained in the framework of both controller matching and system matching. Finally, it is proved that, under a certain condition, the standard adaptive approximation-based controller works as a unified solution for the cases with accurate and inaccurate measurement, and the solution can be extended to nonlinear system control problems with extra unknown dynamics or faults in actuator and/or process dynamics. A single-link robot arm example is used to demonstrate the unified solution in simulation.
Dong Zhao, Ruizhi Yang, Shuang Wang, Qi Zang, Yang Hu, Licheng Jiao, Nicu Sebe, Zhun Zhong
Presently, self-training stands as a prevailing approach in cross-domain semantic segmentation, enhancing model efficacy by training with pixels assigned with reliable pseudo-labels. However, we find two critical limitations in this paradigm. (1) The majority of reliable pixels exhibit a speckle-shaped pattern and are primarily located in the central semantic region. This presents challenges for the model in accurately learning semantics. (2) Category noise in speckle pixels is difficult to locate and correct, leading to error accumulation in self-training. To address these limitations, we propose a novel approach called Semantic Connectivity-driven pseudo-labeling (SeCo). This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics. Specifically, SeCo comprises two key components: Pixel Semantic Aggregation (PSA) and Semantic Connectivity Correction (SCC). Initially, PSA divides semantics into 'stuff' and 'things' categories and aggregates speckled pseudo-labels into semantic connectivity through efficient interaction with the Segment Anything Model (SAM). This enables us not only to obtain accurate boundaries but also simplifies noise localization. Subsequently, SCC introduces a simple connectivity classification task, which enables locating and correcting connectivity noise with the guidance of loss distribution. Extensive experiments demonstrate that SeCo can be flexibly applied to various cross-domain semantic segmentation tasks, including traditional unsupervised, source-free, and black-box domain adaptation, significantly improving the performance of existing state-of-the-art methods. The code is available at https://github.com/DZhaoXd/SeCo.
Wenjun Li, Shudong Wang, Dong Zhao, Shenghui Xu, Zhaoming Pan, Zhimin Zhang
The key to the text-to-video retrieval (TVR) task lies in learning the unique similarity between each pair of text (consisting of words) and video (consisting of audio and image frames) representations. However, aligning video and text representations is problematic: a sentence, and each word within it, differs in importance across video frames. Besides, audio usually carries additional or critical information for TVR when the frames carry little valid information. Therefore, in the TVR task, a multi-granularity representation of text, including the whole sentence and every word, and the audio modality are beneficial, yet they are underutilized in most existing works. To address this, we propose a novel multi-granularity feature interaction module called MGFI, consisting of text-frame and word-frame interactions, for video-text representation alignment. Moreover, we introduce a cross-modal feature interaction module for audio and text called CMFI to address the insufficient expressiveness of video frames. Experiments on benchmark datasets such as MSR-VTT, MSVD, and DiDeMo show that the proposed method outperforms existing state-of-the-art methods.
Dong Zhao, Huadong Ma, Liang Liu
Mobile Crowd Sensing (MCS) is a new paradigm which takes advantage of pervasive smartphones to efficiently collect data, enabling numerous novel applications. To achieve good service quality for an MCS application, incentive mechanisms are necessary to attract more user participation. Most existing mechanisms apply only to the offline scenario where all users' information is known a priori. In contrast, we focus on a more realistic scenario where users arrive one by one online in a random order. Based on the online auction model, we investigate the problem in which users submit their private profiles to the crowdsourcer when they arrive, and the crowdsourcer aims to select a subset of users before a specified deadline so as to minimize the total payment while ensuring that a specific number of tasks can be completed. We design three online mechanisms, Homo-OMZ, Hetero-OMZ and Hetero-OMG, all of which satisfy computational efficiency, individual rationality, cost-truthfulness, and consumer sovereignty. The Homo-OMZ mechanism is applicable to the homogeneous user model and satisfies social efficiency but not constant frugality. The Hetero-OMZ and Hetero-OMG mechanisms are applicable to both the homogeneous and heterogeneous user models, and satisfy constant frugality. In addition, the Hetero-OMG mechanism also satisfies time-truthfulness. Through extensive simulations, we evaluate the performance and validate the theoretical properties of our online mechanisms.
Qi Zang, Shuang Wang, Dong Zhao, Dou Quan, Yang Hu, Licheng Jiao
Change detection is of essential significance for regional development, and pseudo-changes between bitemporal images induced by imaging environmental factors are a key challenge. Existing transformation-based methods regard pseudo-changes as a kind of style shift and alleviate it by transforming bitemporal images into the same style using generative adversarial networks (GANs). However, their efforts are limited by two drawbacks: 1) Transformed images suffer from distortion that reduces feature discrimination. 2) Alignment prevents the model from learning domain-agnostic representations, which degrades performance on scenes with domain shifts from the training data. Therefore, targeting pseudo-changes caused by style differences, we present a generalizable domain-agnostic difference learning network (DonaNet). For drawback 1), we argue for local-level statistics as style proxies to assist against domain shifts. For drawback 2), DonaNet learns domain-agnostic representations by removing the domain-specific style of encoded features and highlighting the class characteristics of objects. For the removal, we propose a domain difference removal module to reduce feature variance while preserving discriminative properties, and propose an enhanced version that can eliminate more style by decorrelating features. For the highlighting, we propose a cross-temporal generalization learning strategy to imitate latent domain shifts, thus enabling the model to actively extract feature representations that are more robust to shifts. Extensive experiments conducted on three public datasets demonstrate that DonaNet outperforms existing state-of-the-art methods with a smaller model size and is more robust to domain shift.
Dong Zhao, Jinlong Li, Shuang Wang, Mengyao Wu, Qi Zang, Nicu Sebe, Zhun Zhong
Vision Foundation Models (VFMs) excel in generalization due to large-scale pretraining, but fine-tuning them for Domain Generalized Semantic Segmentation (DGSS) while maintaining this ability remains challenging. Existing approaches either selectively fine-tune parameters or freeze the VFMs and update only the adapters, both of which may underutilize the VFMs' full potential in DGSS tasks. We observe that domain-sensitive parameters in VFMs, arising from task and distribution differences, can hinder generalization. To address this, we propose FisherTune, a robust fine-tuning method guided by the Domain-Related Fisher Information Matrix (DR-FIM). DR-FIM measures parameter sensitivity across tasks and domains, enabling selective updates that preserve generalization and enhance DGSS adaptability. FisherTune incorporates variational inference to stabilize DR-FIM estimation, treating parameters as Gaussian-distributed variables and leveraging pre-trained priors. Extensive experiments show that FisherTune achieves superior cross-domain segmentation while maintaining generalization, outperforming selective-parameter and adapter-based methods.
Zhao Dong, Ka Chen, Zhaoyang Lv, Hong-Xing Yu, Yunzhi Zhang, Cheng Zhang, Yufeng Zhu, Stephen Tian, Zhengqin Li, Geordie Moffatt, Sean Christofferson, James Fort, Xiaqing Pan, Mingfei Yan, Jiajun Wu, Carl Yuheng Ren, Richard Newcombe
We introduce the Digital Twin Catalog (DTC), a new large-scale photorealistic 3D object digital twin dataset. A digital twin of a 3D object is a highly detailed, virtually indistinguishable representation of a physical object, accurately capturing its shape, appearance, physical properties, and other attributes. Recent advances in neural-based 3D reconstruction and inverse rendering have significantly improved the quality of 3D object reconstruction. Despite these advancements, there remains a lack of a large-scale, digital twin-quality real-world dataset and benchmark that can quantitatively assess and compare the performance of different reconstruction methods, as well as improve reconstruction quality through training or fine-tuning. Moreover, to democratize 3D digital twin creation, it is essential to integrate creation techniques with next-generation egocentric computing platforms, such as AR glasses. Currently, there is no dataset available to evaluate 3D object reconstruction using egocentric captured images. To address these gaps, the DTC dataset features 2,000 scanned digital twin-quality 3D objects, along with image sequences captured under different lighting conditions using DSLR cameras and egocentric AR glasses. This dataset establishes the first comprehensive real-world evaluation benchmark for 3D digital twin creation tasks, offering a robust foundation for comparing and improving existing reconstruction methods. The DTC dataset is already released at https://www.projectaria.com/datasets/dtc/ and we will also make the baseline evaluations open-source.
Dong Zhao, Xiang-Yang Li, Huadong Ma
Mobile crowdsourced sensing (MCS) is a new paradigm which takes advantage of pervasive smartphones to efficiently collect data, enabling numerous novel applications. To achieve good service quality for an MCS application, incentive mechanisms are necessary to attract more user participation. Most existing mechanisms apply only to the offline scenario where all users' information is known a priori. In contrast, we focus on a more realistic scenario where users arrive one by one online in a random order. We model the problem as an online auction in which the users submit their private types to the crowdsourcer over time, and the crowdsourcer aims to select a subset of users before a specified deadline so as to maximize the total value of the services provided by the selected users under a budget constraint. We design two online mechanisms, OMZ and OMG, which satisfy computational efficiency, individual rationality, budget feasibility, truthfulness, consumer sovereignty and constant competitiveness under the zero arrival-departure interval case and a more general case, respectively. Through extensive simulations, we evaluate the performance and validate the theoretical properties of our online mechanisms.
Qi Zang, Jiayi Yang, Shuang Wang, Dong Zhao, Wenjun Yi, Zhun Zhong
Data-driven deep learning models have enabled tremendous progress in change detection (CD) with the support of pixel-level annotations. However, collecting diverse data and manually annotating them is costly, laborious, and knowledge-intensive. Existing generative methods for CD data synthesis show competitive potential in addressing this issue but still face the following limitations: 1) difficulty in flexibly controlling change events, 2) dependence on additional data to train the data generators, 3) focus on specific change detection tasks. To this end, this paper focuses on the semantic CD (SCD) task and develops a multi-temporal SCD data generator ChangeDiff by exploring powerful diffusion models. ChangeDiff innovatively generates change data in two steps: first, it uses text prompts and a text-to-layout (T2L) model to create continuous layouts, and then it employs layout-to-image (L2I) to convert these layouts into images. Specifically, we propose multi-class distribution-guided text prompts (MCDG-TP), allowing for layouts to be generated flexibly through controllable classes and their corresponding ratios. Subsequently, to generalize the T2L model to the proposed MCDG-TP, a class distribution refinement loss is further designed as training supervision. Our generated data shows significant progress in temporal continuity, spatial diversity, and quality realism, empowering change detectors with accuracy and transferability. The code is available at https://github.com/DZhaoXd/ChangeDiff
Youqing Wang, Ying Li, Thomas Parisini, Dong Zhao
We address a distributed adaptive control methodology for nonlinear interconnected systems possibly affected by network anomalies. In the framework of adaptive approximation, the distributed controller and parameter estimator are designed by exploiting a backstepping approach. The stability of the distributed control system under anomalies is analyzed, where both local and neighboring anomaly effects are considered. To quantify the resilience of the interconnected system under the action of network anomalies, we derive bounds on the duration of each anomaly and the resting time between two consecutive anomalies. Specifically, when each anomaly duration is smaller than our designed upper bound, the interconnected system controlled by the distributed approximation-based controller remains asymptotically stable. Moreover, if the resting time between two consecutive anomalies is larger than the proposed bound, then all signals of the control system are guaranteed to be bounded. In the paper, we show that under the action of the proposed distributed adaptive controller, the interconnected system remains stable in the presence of network anomalies, with both the qualitative and quantitative resilient conditions. Extensive simulation results show the effectiveness of our theoretical results.
Dong Zhao, Zhi-Chao Zhao, Jun-Qing Xia
Jun 17, 2020 · astro-ph.CO
We propose a method based on the process of extracting gravitational wave (GW) parameters from GW signals to simulate the binary neutron-star (BNS) merging events. We simulate 1000 GW standard sirens based on the observation of the Einstein Telescope (ET). Almost all the simulated GW data are in the redshift range of $[0,3]$. The role of the GW standard siren in the inference of the cosmological parameters is investigated. We find that the GW data can help improve the accuracy of cosmological parameters. Moreover, the degeneracy of cosmological parameters is broken by the GW data. The GW standard siren is helpful for the constraint of the cosmological parameters.
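A standard siren constrains cosmology because the GW signal yields a luminosity distance that can be compared with the model prediction at the source redshift. The sketch below evaluates that prediction, $d_L(z) = (1+z)\,\frac{c}{H_0}\int_0^z \frac{dz'}{E(z')}$, assuming a flat $\Lambda$CDM background with $E(z) = \sqrt{\Omega_m (1+z)^3 + 1 - \Omega_m}$. The default values $H_0 = 70$ km/s/Mpc and $\Omega_m = 0.3$ and the function name are illustrative assumptions, not the paper's fiducial setup.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Luminosity distance in Mpc for flat LCDM, integrated with Simpson's rule."""
    if z == 0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + 1.0 - omega_m)

    n = steps if steps % 2 == 0 else steps + 1  # Simpson needs an even interval count
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving_integral = total * h / 3.0
    return (1.0 + z) * (C_KM_S / h0) * comoving_integral


# Distances across the redshift range quoted in the abstract.
for z in (0.5, 1.0, 2.0, 3.0):
    print(z, luminosity_distance_mpc(z))
```

In a parameter fit, predictions like these would be compared with the simulated siren distances and their uncertainties, with $H_0$ and $\Omega_m$ varied.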
Dong Zhao, Huadong Ma, Xinna Ji
Incentive mechanism design has attracted extensive attention for crowdsourcing applications in recent years. Most research assumes that participants are already in the system and aware of the existence of crowdsourcing tasks. In real-life scenarios where this assumption does not hold, it is more effective to leverage incentive tree mechanisms that incentivize both users' direct contributions and their solicitations of other users. Although some such mechanisms have been investigated, we are the first to propose budget-consistent incentive tree mechanisms, called generalized lottrees, which require the total payout to all participants to be consistent with the announced budget, while guaranteeing several other desirable properties including continuing contribution incentive, continuing solicitation incentive, value proportional to contribution, unprofitable solicitor bypassing, and unprofitable sybil attack. Moreover, we present three types of generalized lottree mechanisms, 1-Pachira, K-Pachira, and Sharing-Pachira, which support more diversified requirements. Solid theoretical guidance for mechanism selection is also provided based on Cumulative Prospect Theory. Both extensive simulations and realistic experiments with 82 users have been conducted to confirm our theoretical analysis.
Dong Zhao, Ming-Hua Li, Ping Wang, Zhe Chang
Dec 16, 2014 · astro-ph.CO
The influence of cosmological constant type dark energy in the early universe is investigated. This is accommodated by a new dispersion relation in de Sitter spacetime. We perform a global fitting to explore the cosmological parameters space by using the CosmoMC package with the recently released Planck TT and WMAP Polarization datasets. Using the results from global fitting, we compute a new CMB temperature-temperature spectrum. The obtained TT spectrum has lower power compared with the one based on $\Lambda$CDM model at large scales.