Yan Sun, Alvin Tang, Ching-Hua Wang, Yanqing Zhao, Mengmeng Bai, Shuting Xu, Zheqi Xu, Tao Tang, Sheng Wang, Chenguang Qiu, Kang Xu, Xubiao Peng, Junfeng Han, Eric Pop, Yang Chai, Yao Guo
Poor electrical contact to two-dimensional (2D) materials is a critical challenge for their application in post-silicon very-large-scale integrated circuits. Electrical contacts have generally been characterized by their resistive effect, quantified as contact resistance. Through a systematic investigation, this work demonstrates a capacitive metal-insulator-semiconductor (MIS) field effect at the electrical contacts to 2D materials: the field effect depletes or accumulates charge carriers, redistributes the voltage potential, and gives rise to abnormal current saturation and nonlinearity. On the one hand, the current saturation hinders the devices' driving ability, but it can be eliminated with carefully engineered contact configurations. On the other hand, by introducing the nonlinearity into monolithic analog artificial neural network circuits, the circuits' perception ability can be significantly enhanced, as evidenced using a COVID-19 critical illness prediction model. This work provides an understanding of the field effect at the electrical contacts to 2D materials, which is fundamental to the design, simulation, and fabrication of electronics based on 2D materials.
Yao Guo, Yang Ai, Rui-Chen Zheng, Hui-Peng Du, Xiao-Hang Jiang, Zhen-Hua Ling
This paper proposes a novel vision-integrated neural speech codec (VNSC), which aims to enhance speech coding quality by leveraging visual modality information. In VNSC, the image analysis-synthesis module extracts visual features from lip images, while the feature fusion module facilitates interaction between the image analysis-synthesis module and the speech coding module, transmitting visual information to assist the speech coding process. Depending on whether visual information is available during the inference stage, the feature fusion module integrates visual features into the speech coding module using either explicit integration or implicit distillation strategies. Experimental results confirm that integrating visual information effectively improves the quality of the decoded speech and enhances the noise robustness of the neural speech codec, without increasing the bitrate.
Qing Song, Yao Guo, Jianan Jiang, Chun Liu, Mengjie Hu
Railway transportation is the artery of China's national economy and plays an important role in the development of today's society. Because China's railway security inspection technology started late, current inspection tasks rely mainly on manual inspection, which is inefficient and requires substantial manpower and material resources. In this paper, we establish a steel rail fastener detection image dataset, which contains 4,000 rail fastener pictures of 4 types. We use a region proposal network to generate regions of interest, extract features using a convolutional neural network, and fuse the classifier into the detection network. We use online hard sample mining to improve the accuracy of the model and optimize the Faster RCNN detection framework by reducing the number of regions of interest. Finally, the model accuracy reaches 99% and the speed reaches 35 FPS when deployed on a TITAN X GPU.
Lin Yan, Yao Guo, Xiangqun Chen, Hong Mei
Power side channels are an important category of side channels that can be exploited to steal confidential information from a computing system by analyzing its power consumption. In this paper, we demonstrate the existence of various power side channels on popular mobile devices such as smartphones. Based on unprivileged power consumption traces, we present a list of real-world attacks that can be initiated to identify running apps, infer sensitive UIs, guess password lengths, and estimate geo-locations. These attack examples demonstrate that power consumption traces can be used as a practical side channel to obtain various kinds of confidential information about mobile apps running on smartphones. Based on these power side channels, we discuss possible exploitations and present a general approach to exploiting a power side channel on an Android smartphone, which demonstrates that power side channels pose imminent threats to the security and privacy of mobile users. We also discuss possible countermeasures to mitigate the threats of power side channels.
Yao Guo, Feng Liu, Aihong Tang
The directed flow of inclusive, transported and non-transported (including produced) protons, as well as antiprotons, has been studied in the framework of the Ultra-Relativistic Quantum Molecular Dynamics (UrQMD) model for Au+Au collisions at $\sqrt{s_{NN}}$ = 7.7, 11.5, 19.6, 27, 39, 62.4 and 200 GeV. The rapidity, centrality and energy dependence of directed flow for various proton groups are presented. It is found that the integrated directed flow decreases monotonically as a function of collision energy for $\sqrt{s_{NN}}$ = 11.5 GeV and beyond. However, the sign-change of directed flow of inclusive protons, seen in experimental data as a function of centrality and collision energy, can be explained by the competing effect of directed flow between transported and non-transported protons. Similarly, the difference in directed flow between protons and antiprotons can be explained. Our study offers a conventional explanation of the cause of the $v_1$ sign-change other than the antiflow component of protons alone, which is argued to be linked to a phase transition.
Yao Guo, Wenjie Zhong, Yiqiu Ma, Daiqin Su
Ultralight bosons are potential candidates for dark matter. If they exist, they can be generated by a rapidly rotating black hole via superradiance, extracting the energy and angular momentum of the black hole and forming a boson cloud. The boson cloud can be affected by the presence of a companion star, generating rich dynamical effects and producing characteristic gravitational wave signals. We study the dynamics of the boson cloud in a binary black hole system. In particular, we develop a framework to study the mass transfer between two black holes. It is found that bosons occupying the growing modes of the central black hole can jump to the decaying modes of the companion black hole, resulting in cloud depletion. This mechanism of cloud depletion is different from that induced by the resonant perturbation from the companion.
Mengmeng Bai, Yanqing Zhao, Shuting Xu, Yao Guo
Geometric diodes, which take advantage of geometric asymmetry to achieve current flow preference, are promising for THz current rectification. Previous studies relate geometric diodes' rectification to quantum coherent or ballistic transport, which is fragile and demands a high-quality transport system. Here we propose a different physical picture and demonstrate a robust current rectification originating from asymmetric bias-induced barrier lowering, which generally applies to common semiconductors in normal environments. Key factors in the diode's performance are carefully analyzed, and an intrinsic rectification ability at up to 1.1 THz is demonstrated.
Yao Guo, Junyang Lu
Multi-core processors are becoming more and more popular in embedded and real-time systems. While fixed-priority scheduling with task-splitting is widely applied in real-time systems, current approaches have not taken into consideration energy-aware aspects such as dynamic voltage scaling (DVS). In this paper, we propose two strategies to apply DVS to fixed-priority scheduling algorithms with task-splitting for periodic real-time tasks on multi-core processors. The first strategy determines voltage scales for each processor after scheduling (Static DVS), which ensures all tasks meet the timing requirements on synchronization. The second strategy adaptively determines the frequency of each task before scheduling (Adaptive DVS) according to the total utilization of the task set and the number of cores available. The combination of frequency pre-allocation and task-splitting makes it possible to maximize energy savings with DVS. Simulation results show that it is possible to achieve significant energy savings with DVS while preserving the schedulability requirements of real-time schedulers for multi-core processors.
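The abstract does not give the allocation rule behind Adaptive DVS, so the following is only a minimal sketch of the general idea of choosing a frequency from total utilization and core count. It assumes a single frequency shared by all tasks (the paper assigns one per task), a hypothetical set of normalized frequency levels, and a hypothetical per-core utilization bound `bound`; the function name `pick_frequency` is illustrative.

```python
from typing import Sequence


def pick_frequency(utilizations: Sequence[float], num_cores: int,
                   levels: Sequence[float], bound: float = 1.0) -> float:
    """Return the lowest normalized frequency that keeps the load within bound.

    utilizations: per-task utilization (WCET / period) measured at full speed.
    levels: available normalized frequencies in (0, 1] (hypothetical values).
    bound: assumed per-core schedulable utilization bound of the scheduler.
    """
    total = sum(utilizations)
    for f in sorted(levels):
        # Running at frequency f stretches execution times by 1/f,
        # so the effective total utilization becomes total / f.
        if total / f <= num_cores * bound:
            return f
    raise ValueError("task set does not fit even at the highest frequency")


tasks = [0.3, 0.2, 0.25, 0.15]  # total utilization of about 0.9
freq = pick_frequency(tasks, num_cores=2, levels=[0.25, 0.5, 0.75, 1.0])
print(freq)  # 0.5
```

Lower frequencies are tried first because dynamic power falls with voltage and frequency, so the first level that still fits the capacity check is the most energy-efficient one under these assumptions.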
Yao Guo, Youfu Li, Zhanpeng Shao
Motion behaviors of a rigid body can be characterized by a 6-dimensional motion trajectory, which contains position vectors of a reference point on the rigid body and rotations of this rigid body over time. This paper devises a Rotation and Relative Velocity (RRV) descriptor by exploring the local translational and rotational invariants of motion trajectories of rigid bodies. The descriptor is insensitive to noise and invariant to rigid transformation and scaling. A flexible metric is also introduced to measure the distance between two RRV descriptors. The RRV descriptor is then applied to characterize motions of a human body skeleton modeled as articulated interconnections of multiple rigid bodies. To illustrate the descriptive ability of the RRV descriptor, we apply it to different rigid body motion recognition tasks. The experimental results on benchmark datasets demonstrate that this simple RRV descriptor outperforms previous ones in recognition accuracy without increasing computational cost.
Guo Yao Tham, Ranjith Nair, Mile Gu
Oct 17, 2023 · quant-ph
In covert target detection, Alice attempts to send optical or microwave probes to determine the presence or absence of a weakly-reflecting target embedded in thermal background radiation within a target region, while striving to remain undetected by an adversary, Willie, who is co-located with the target and collects all light that does not return to Alice. We formulate this problem in a realistic setting and derive quantum-mechanical limits on Alice's error probability performance in entanglement-assisted target detection for any fixed level of her detectability by Willie. We demonstrate how Alice can approach this performance limit using two-mode squeezed vacuum probes in the regime of small to moderate background brightness, and how such protocols can outperform any conventional approach using Gaussian-distributed coherent states. In addition, we derive a universal performance bound for non-adversarial quantum illumination without requiring the passive-signature assumption.
Guo Yao Tham
Oct 15, 2024 · quant-ph
This thesis presents three studies in quantum-enhanced sensing and target detection. The first study explores covert target detection using optical or microwave probes, establishing quantum-mechanical limits on the error probabilities of entanglement-assisted detection methods while maintaining the sender's covertness. It identifies the minimal energy required to preserve covertness and reduce error probabilities, compares two-mode squeezed vacuum probes and coherent states against these limits, and extends the analysis to discriminating thermal loss channels and non-adversarial quantum illumination. The second study focuses on phase-insensitive optical amplifiers, determining the quantum limit on the precision of gain estimation using multimode probes possibly entangled with ancillary systems. It finds that the average photon number and the number of input modes are interchangeable resources for achieving optimal gain sensing precision, contrasting classical probes with quantum probes and highlighting the advantages of the latter, even with single-photon inputs and inefficient photodetection. It also provides a closed-form expression for the energy-constrained Bures distance between two amplifier channels. The third study compares three probe states (coherent state, two-mode squeezed vacuum (TMSV), and single-photon entangled state (SPES)) in quantum-enhanced target detection, assessing their performance under signal energy constraints relevant to covert radar sensing. SPES is uniquely positioned as a practical probe due to its non-classical properties after thermal loss and ease of generation. Numerical analysis shows that at low signal energies, the error exponent of TMSV aligns with that of SPES, indicating comparable detection capabilities, and that SPES surpasses the best classical state, the coherent state, in accuracy for certain signal strengths.
Y. Guo, D. Wu, J. Zhang
Mushroom instability (MI) is a shear instability considered responsible for generating and amplifying magnetic fields in relativistic jets. While astrophysical jets are usually magnetized, how MI acts in magnetized jets remains poorly understood. In this paper, we investigate the effect of a flow-aligned external magnetic field on MI, with both theoretical analyses and particle-in-cell (PIC) simulations. In the limit of a cold and collisionless plasma, we derive a generalized dispersion relation for linear growth rates of the magnetized MIs. Numerical solutions of the dispersion relation reveal that the external magnetic field always suppresses the growth of MI, though MIs are much more robust against the external magnetic field than electron-scale Kelvin-Helmholtz instabilities (ESKHIs). Analyses are also extended to instabilities with an arbitrary wavevector in the shear interface plane, where a coupling effect is observed for sub-relativistic scenarios. Two-dimensional PIC simulations of single-mode MIs show good agreement with our analytical predictions, and we observe the formation of a quasi-steady saturation structure in magnetized runs. In simulations with finite temperatures, we observe competition and cooperation between MIs and a diffusion-induced DC magnetic field.
Anh Nguyen, Dennis Kundrat, Giulio Dagnino, Wenqiang Chi, Mohamed E. M. K. Abdelaziz, Yao Guo, YingLiang Ma, Trevor M. Y. Kwok, Celia Riga, Guang-Zhong Yang
Accurate real-time catheter segmentation is an important pre-requisite for robot-assisted endovascular intervention. Most of the existing learning-based methods for catheter segmentation and tracking are only trained on small-scale datasets or synthetic data due to the difficulties of ground-truth annotation. Furthermore, the temporal continuity in intraoperative imaging sequences is not fully utilised. In this paper, we present FW-Net, an end-to-end and real-time deep learning framework for endovascular intervention. The proposed FW-Net has three modules: a segmentation network with encoder-decoder architecture, a flow network to extract optical flow information, and a novel flow-guided warping function to learn the frame-to-frame temporal continuity. We show that by effectively learning temporal continuity, the network can successfully segment and track the catheters in real-time sequences using only raw ground-truth for training. Detailed validation results confirm that our FW-Net outperforms state-of-the-art techniques while achieving real-time performance.
Xiao-Yun Zhou, Yao Guo, Mali Shen, Guang-Zhong Yang
Artificial Intelligence (AI) is gradually changing the practice of surgery with the advanced technological development of imaging, navigation and robotic intervention. In this article, the recent successful and influential applications of AI in surgery are reviewed, from pre-operative planning and intra-operative guidance to the integration of surgical robots. We conclude by summarizing the current state, emerging trends and major challenges in the future development of AI in surgery.
Yuanchun Li, Ziyue Yang, Yao Guo, Xiangqun Chen, Yuvraj Agarwal, Jason Hong
Personalized services need a rich and powerful personal knowledge base, i.e. a knowledge base containing information about the user. This paper proposes an approach to extracting personal knowledge from smartphone push notifications, which are used by mobile systems and apps to inform users of a rich range of information. Our solution is based on the insight that most notifications are formatted using templates, while knowledge entities can usually be found within the template parameters. As defining all the notification templates and their semantic rules is impractical due to the huge number of notification templates used by potentially millions of apps, we propose an automated approach for personal knowledge extraction from push notifications. We first discover notification templates through pattern mining, then use machine learning to understand the template semantics. Based on the templates and their semantics, we are able to translate notification text into knowledge facts automatically. Users' privacy is preserved because only the templates, which do not contain any personal information, need to be uploaded to the server for model training. According to our experiments with about 120 million push notifications from 100,000 smartphone users, our system is able to extract personal knowledge accurately and efficiently.
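The insight that notifications are template text with knowledge entities in the parameter slots can be illustrated with a deliberately naive sketch. This is not the pattern-mining method of the paper: it assumes whitespace tokenization, groups messages only by token count, and treats every position whose token varies within a group as a slot. The function name `mine_templates`, the `<*>` placeholder, and the sample messages are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def mine_templates(messages: List[str],
                   min_support: int = 2) -> Dict[str, List[Tuple[str, ...]]]:
    """Map each discovered template to the parameter tuples extracted from it."""
    by_length = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        by_length[len(tokens)].append(tokens)

    templates = {}
    for group in by_length.values():
        if len(group) < min_support:
            continue  # too few examples to separate constants from parameters
        # A position is constant if every message in the group has the same token there.
        constant = [len({toks[i] for toks in group}) == 1
                    for i in range(len(group[0]))]
        template = " ".join(tok if keep else "<*>"
                            for tok, keep in zip(group[0], constant))
        params = [tuple(t for t, keep in zip(toks, constant) if not keep)
                  for toks in group]
        templates[template] = params
    return templates


result = mine_templates([
    "Your package 1234 has been shipped",
    "Your package 9876 has been shipped",
    "Alice sent you a message",
    "Bob sent you a message",
])
for template, params in result.items():
    print(template, params)
```

Grouping by token count alone would merge unrelated templates that happen to have the same length, which is one reason a real system needs proper pattern mining and a learned model of template semantics.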
Ningyu He, Weihang Su, Zhou Yu, Xinyu Liu, Fengyi Zhao, Haoyu Wang, Xiapu Luo, Gareth Tyson, Lei Wu, Yao Guo
The continuing expansion of blockchain ecosystems has attracted much attention from the research community. However, although a large number of studies have been conducted to understand the diverse characteristics of individual blockchain systems (e.g., Bitcoin or Ethereum), little is known at a comprehensive level about the evolution of blockchain ecosystems at scale, longitudinally, and across multiple blockchains. We argue that understanding the dynamics of blockchain ecosystems could provide unique insights that cannot be achieved through studying a single static snapshot or a single blockchain network alone. Based on billions of transaction records collected from three representative and popular blockchain systems (Bitcoin, Ethereum and EOSIO) over 10 years, we conduct the first study on the evolution of multiple blockchain ecosystems from different perspectives. Our exploration suggests that, although the overall blockchain ecosystem shows promising growth over the last decade, a number of worrying outliers exist that have disrupted its evolution.
Yuanchun Li, Ziyue Yang, Yao Guo, Xiangqun Chen
Automated input generators are widely used for large-scale dynamic analysis of mobile apps. Such input generators must constantly choose which UI element to interact with and how to interact with it, in order to achieve high coverage with a limited time budget. Currently, most input generators adopt pseudo-random or brute-force searching strategies, which may take a very long time to find the correct combination of inputs that can drive the app into new and important states. In this paper, we propose Humanoid, a deep learning-based approach to GUI test input generation by learning from human interactions. Our insight is that if we can learn from human-generated interaction traces, it is possible to automatically prioritize test inputs based on their importance as perceived by users. We design and implement a deep neural network model to learn how end-users would interact with an app (specifically, which UI elements to interact with and how). Our experiments showed that the interaction model can successfully prioritize user-preferred inputs for any new UI (with a top-1 accuracy of 51.2% and a top-10 accuracy of 85.2%). We implemented an input generator for Android apps based on the learned model and evaluated it on both open-source apps and market apps. The results indicated that Humanoid was able to achieve higher coverage than six state-of-the-art test generators. However, further analysis showed that the learned model was not the main reason for the coverage improvement. Although the learned interaction pattern could drive the app into some important GUI states with higher probabilities, it had limited effect on the width and depth of GUI state search, which is the key to improving test coverage in the long term. Whether and how human interaction patterns can be used to improve coverage is still an open and challenging problem.
Ningyu He, Haoyu Wang, Lei Wu, Xiapu Luo, Yao Guo, Xiangqun Chen
EOSIO, one of the most representative blockchain 3.0 platforms, introduces many new features, e.g., the delegated proof of stake consensus algorithm and updatable smart contracts, enabling a much higher number of transactions per second and a prosperous decentralized applications (DApps) ecosystem. According to the statistics, it has reached nearly 18 billion USD, taking third place in the whole cryptocurrency market, following Bitcoin and Ethereum. Loopholes, however, are hiding in the shadows. EOSBet, a famous gambling DApp, was attacked twice within a month and lost more than 1 million USD. No existing work has surveyed EOSIO from a security researcher's perspective. To fill this gap, in this paper, we collected all attack events that have occurred against EOSIO and systematically studied their root causes, i.e., vulnerabilities lurking in the components that EOSIO relies on, as well as the corresponding attacks and mitigations. We also summarized some best practices for DApp developers, the EOSIO official team, and security researchers as future directions.
Jingru Zhu, Ya Guo, Geng Sun, Libo Yang, Min Deng, Jie Chen
Semantic segmentation is a key technique in the automatic interpretation of high-resolution remote sensing (HRS) imagery and has drawn much attention in the remote sensing community. Deep convolutional neural networks (DCNNs) have been successfully applied to the HRS imagery semantic segmentation task due to their hierarchical representation ability. However, their heavy dependence on large amounts of densely annotated training data and their sensitivity to variations in data distribution severely restrict the potential application of DCNNs to the semantic segmentation of HRS imagery. This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of HRS imagery. MemoryAdaptNet constructs an output space adversarial learning scheme to bridge the distribution discrepancy between the source domain and the target domain and to reduce the influence of domain shift. Specifically, we embed an invariant feature memory module to store invariant domain-level context information, because the features obtained from adversarial learning tend to represent only the variant features of the current limited inputs. A category attention-driven invariant domain-level context aggregation module integrates the contents of this memory module with the current pseudo-invariant features to further augment the pixel representations. An entropy-based pseudo-label filtering strategy is used to update the memory module with high-confidence pseudo-invariant features of the current target images. Extensive experiments on three cross-domain tasks indicate that the proposed MemoryAdaptNet is remarkably superior to state-of-the-art methods.
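The abstract does not specify how the entropy-based filter is computed, so the following is a generic sketch of the technique rather than the MemoryAdaptNet implementation. It assumes per-pixel softmax probabilities over at least two classes, normalizes Shannon entropy by its maximum so that it lies in [0, 1], and uses a hypothetical threshold `max_norm_entropy`; the function names and sample values are illustrative.

```python
import math
from typing import List, Optional, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def filter_pseudo_labels(pixel_probs: Sequence[Sequence[float]],
                         max_norm_entropy: float = 0.5) -> List[Optional[int]]:
    """Return the argmax class for confident pixels and None for uncertain ones."""
    labels: List[Optional[int]] = []
    for probs in pixel_probs:
        # Divide by log(number of classes) so the entropy lies in [0, 1].
        norm = entropy(probs) / math.log(len(probs))
        if norm <= max_norm_entropy:
            labels.append(max(range(len(probs)), key=probs.__getitem__))
        else:
            labels.append(None)  # too uncertain to use as a pseudo label
    return labels


pixels = [
    [0.96, 0.02, 0.02],  # confident prediction of class 0
    [0.40, 0.35, 0.25],  # nearly uniform, so it is filtered out
    [0.05, 0.90, 0.05],  # confident prediction of class 1
]
labels = filter_pseudo_labels(pixels)
print(labels)  # [0, None, 1]
```

Only the pixels that survive the filter would contribute features to a memory update in a scheme like the one described, which keeps low-confidence target predictions from contaminating the stored context.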
Dandan Zhang, Zicong Wu, Junhong Chen, Ruiqi Zhu, Adnan Munawar, Bo Xiao, Yuan Guan, Hang Su, Wuzhou Hong, Yao Guo, Gregory S. Fischer, Benny Lo, Guang-Zhong Yang
Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitate efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some of the surgical subtasks for the construction of the shared control framework. However, a sufficient amount of data is required for the robot to learn the manoeuvres. Using a surgical simulator to collect data is a less resource-demanding approach. With sim-to-real adaptation, the manoeuvres learned from a simulator can be transferred to a physical robot. To this end, we propose a sim-to-real adaptation method to construct a human-robot shared control framework for robotic surgery. In this paper, a desired trajectory is generated from a simulator using an LfD method, while a method based on dynamic motion primitives (DMPs) is used to transfer the desired trajectory from the simulator to the physical robotic platform. Moreover, a role adaptation mechanism is developed such that the robot can adjust its role according to the surgical operation contexts predicted by a neural network model. The effectiveness of the proposed framework is validated on the da Vinci Research Kit (dVRK). Results of the user studies indicated that with the adaptive human-robot shared control framework, the path length of the remote controller, the total clutching number and the task completion time can be reduced significantly. The proposed method outperformed traditional manual control via teleoperation.