"au:"Kwunhang Wong"" — arXiv2 Search

Showing 1–5 of 5 results

SNNGX: Securing Spiking Neural Networks with Genetic XOR Encryption on RRAM-based Neuromorphic Accelerator

Kwunhang Wong, Songqi Wang, Wei Huang, Xinyuan Zhang, Yangu He, Karl M. H. Lai, Yuzhong Jiao, Ning Lin, Xiaojuan Qi, Xiaoming Chen, Zhongrui Wang

Jul 21, 2024·cs.CR·PDF

Biologically plausible Spiking Neural Networks (SNNs), characterized by spike sparsity, are growing tremendous attention over intellectual edge devices and critical bio-medical applications as compared to artificial neural networks (ANNs). However, there is a considerable risk from malicious attempts to extract white-box information (i.e., weights) from SNNs, as attackers could exploit well-trained SNNs for profit and white-box adversarial concerns. There is a dire need for intellectual property (IP) protective measures. In this paper, we present a novel secure software-hardware co-designed RRAM-based neuromorphic accelerator for protecting the IP of SNNs. Software-wise, we design a tailored genetic algorithm with classic XOR encryption to target the least number of weights that need encryption. From a hardware perspective, we develop a low-energy decryption module, meticulously designed to provide zero decryption latency. Extensive results from various datasets, including NMNIST, DVSGesture, EEGMMIDB, Braille Letter, and SHD, demonstrate that our proposed method effectively secures SNNs by encrypting a minimal fraction of stealthy weights, only 0.00005% to 0.016% weight bits. Additionally, it achieves a substantial reduction in energy consumption, ranging from x59 to x6780, and significantly lowers decryption latency, ranging from x175 to x4250. Moreover, our method requires as little as one sample per class in dataset for encryption and addresses hessian/gradient-based search insensitive problems. This strategy offers a highly efficient and flexible solution for securing SNNs in diverse applications.

When Pipelined In-Memory Accelerators Meet Spiking Direct Feedback Alignment: A Co-Design for Neuromorphic Edge Computing

Haoxiong Ren, Yangu He, Kwunhang Wong, Rui Bao, Ning Lin, Zhongrui Wang, Dashan Shang

Jul 21, 2025·cs.AR·PDF

Spiking Neural Networks (SNNs) are increasingly favored for deployment on resource-constrained edge devices due to their energy-efficient and event-driven processing capabilities. However, training SNNs remains challenging because of the computational intensity of traditional backpropagation algorithms adapted for spike-based systems. In this paper, we propose a novel software-hardware co-design that introduces a hardware-friendly training algorithm, Spiking Direct Feedback Alignment (SDFA) and implement it on a Resistive Random Access Memory (RRAM)-based In-Memory Computing (IMC) architecture, referred to as PipeSDFA, to accelerate SNN training. Software-wise, the computational complexity of SNN training is reduced by the SDFA through the elimination of sequential error propagation. Hardware-wise, a three-level pipelined dataflow is designed based on IMC architecture to parallelize the training process. Experimental results demonstrate that the PipeSDFA training accelerator incurs less than 2% accuracy loss on five datasets compared to baselines, while achieving 1.1X~10.5X and 1.37X~2.1X reductions in training time and energy consumption, respectively compared to PipeLayer.

Resistive memory-based zero-shot liquid state machine for multimodal event data learning

Ning Lin, Shaocong Wang, Yi Li, Bo Wang, Shuhui Shi, Yangu He, Woyu Zhang, Yifei Yu, Yue Zhang, Xinyuan Zhang, Kwunhang Wong, Songqi Wang, Xiaoming Chen, Hao Jiang, Xumeng Zhang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Ming Liu

Jul 3, 2023·cs.ET·PDF

The human brain is a complex spiking neural network (SNN), capable of learning multimodal signals in a zero-shot manner by generalizing existing knowledge. Remarkably, it maintains minimal power consumption through event-based signal propagation. However, replicating the human brain in neuromorphic hardware presents both hardware and software challenges. Hardware limitations, such as the slowdown of Moore's law and Von Neumann bottleneck, hinder the efficiency of digital computers. Additionally, SNNs are characterized by their software training complexities. To this end, we propose a hardware-software co-design on a 40 nm 256 Kb in-memory computing macro that physically integrates a fixed and random liquid state machine (LSM) SNN encoder with trainable artificial neural network (ANN) projections. We showcase the zero-shot LSM-based learning of multimodal events on the N-MNIST and N-TIDIGITS datasets, including visual and audio data association, as well as neural and visual data alignment for brain-machine interfaces. Our co-design achieves classification accuracy comparable to fully optimized software models, resulting in a 152.83 and 393.07-fold reduction in training costs compared to SOTA contrastive language-image pre-training (CLIP) and Prototypical networks, and a 23.34 and 160-fold improvement in energy efficiency compared to cutting-edge digital hardware, respectively. These proof-of-principle prototypes demonstrate zero-shot multimodal events learning capability for emerging efficient and compact neuromorphic hardware.

Older and Wiser: The Marriage of Device Aging and Intellectual Property Protection of Deep Neural Networks

Ning Lin, Shaocong Wang, Yue Zhang, Yangu He, Kwunhang Wong, Arindam Basu, Dashan Shang, Xiaoming Chen, Zhongrui Wang

Jun 21, 2024·cs.CR·PDF

Deep neural networks (DNNs), such as the widely-used GPT-3 with billions of parameters, are often kept secret due to high training costs and privacy concerns surrounding the data used to train them. Previous approaches to securing DNNs typically require expensive circuit redesign, resulting in additional overheads such as increased area, energy consumption, and latency. To address these issues, we propose a novel hardware-software co-design approach for DNN intellectual property (IP) protection that capitalizes on the inherent aging characteristics of circuits and a novel differential orientation fine-tuning (DOFT) to ensure effective protection. Hardware-wise, we employ random aging to produce authorized chips. This process circumvents the need for chip redesign, thereby eliminating any additional hardware overhead during the inference procedure of DNNs. Moreover, the authorized chips demonstrate a considerable disparity in DNN inference performance when compared to unauthorized chips. Software-wise, we propose a novel DOFT, which allows pre-trained DNNs to maintain their original accuracy on authorized chips with minimal fine-tuning, while the model's performance on unauthorized chips is reduced to random guessing. Extensive experiments on various models, including MLP, VGG, ResNet, Mixer, and SwinTransformer, with lightweight binary and practical multi-bit weights demonstrate that the proposed method achieves effective IP protection, with only 10\% accuracy on unauthorized chips, while preserving nearly the original accuracy on authorized ones.

Towards Secure and Efficient DNN Accelerators via Hardware-Software Co-Design

Wei Xuan, Zihao Xuan, Rongliang Fu, Ning Lin, Kwunhang Wong, Zikang Yuan, Lang Feng, Zhongrui Wang, Tsung-Yi Ho, Yuzhong Jiao, Luhong Liang

Feb 24, 2026·cs.CR·PDF

The rapid deployment of deep neural network (DNN) accelerators in safety-critical domains such as autonomous vehicles, healthcare systems, and financial infrastructure necessitates robust mechanisms to safeguard data confidentiality and computational integrity. Existing security solutions for DNN accelerators, however, suffer from excessive hardware resource demands and frequent off-chip memory access overheads, which degrade performance and scalability. To address these challenges, this paper presents a secure and efficient memory protection framework for DNN accelerators with minimal overhead. First, we propose a bandwidth-aware cryptographic scheme that adapts encryption granularity based on memory traffic patterns, striking a balance between security and resource efficiency. Second, we observe that both the overlapping regions in the intra-layer tiling's sliding window pattern and those resulting from inter-layer tiling strategy discrepancies introduce substantial redundant memory accesses and repeated computational overhead in cryptography. Third, we introduce a multi-level authentication mechanism that effectively eliminates unnecessary off-chip memory accesses, enhancing performance and energy efficiency. Experimental results show that this work decreases performance overhead by over 12% and achieves 87% energy efficiency improvement for both server and edge neural processing units (NPUs), while ensuring robust scalability.

Resistive memory-based zero-shot liquid state machine for multimodal event data learning

Jul 3, 2023·cs.ET·PDF