Shangtong Cao, Ningyu He, Yao Guo, Haoyu Wang
Binary rewriting is a widely adopted technique in software analysis. WebAssembly (Wasm), as an emerging bytecode format, has attracted great attention from our community. Unfortunately, there is no general-purpose binary rewriting framework for Wasm, and existing effort on Wasm binary modification is error-prone and tedious. In this paper, we present BREWasm, the first general purpose static binary rewriting framework for Wasm, which has addressed inherent challenges of Wasm rewriting including high complicated binary structure, strict static syntax verification, and coupling among sections. We perform extensive evaluation on diverse Wasm applications to show the efficiency, correctness and effectiveness of BREWasm. We further show the promising direction of implementing a diverse set of binary rewriting tasks based on BREWasm in an effortless and user-friendly manner.
Tianyang Chi, Ningyu He, Xiaohui Hu, Haoyu Wang
Maximal Extractable Value (MEV) drives the prosperity of the blockchain ecosystem. By strategically including, excluding, or reordering transactions within blocks, block producers can extract additional value, which in turn incentivizes them to keep the decentralization of the whole blockchain platform. Before September 2022, around $675M was extracted in terms of MEV in Ethereum. Despite its importance, current work on identifying MEV activities suffers from two limitations. On the one hand, current methods heavily rely on clumsy heuristic rule-based patterns, leading to numerous false negatives or positives. On the other hand, the observations and conclusions are drawn from the early stage of Ethereum, which cannot be used as effective guiding principles after The Merge. To address these challenges, in this work, we innovatively proposed a profitability identification algorithm. Based on this, we designed two robust algorithms to identify MEV activities on our collected largest-ever dataset. Based on the identified results, we have characterized the overall landscape of the Ethereum MEV ecosystem, the impact the private transaction architectures bring in, and the adoption of back-running mechanisms. Our research sheds light on future MEV-related work.
Ningyu He, Weihang Su, Zhou Yu, Xinyu Liu, Fengyi Zhao, Haoyu Wang, Xiapu Luo, Gareth Tyson, Lei Wu, Yao Guo
The continuing expansion of the blockchain ecosystems has attracted much attention from the research community. However, although a large number of research studies have been proposed to understand the diverse characteristics of individual blockchain systems (e.g., Bitcoin or Ethereum), little is known at a comprehensive level on the evolution of blockchain ecosystems at scale, longitudinally, and across multiple blockchains. We argue that understanding the dynamics of blockchain ecosystems could provide unique insights that cannot be achieved through studying a single static snapshot or a single blockchain network alone. Based on billions of transaction records collected from three representative and popular blockchain systems (Bitcoin, Ethereum and EOSIO) over 10 years, we conduct the first study on the evolution of multiple blockchain ecosystems from different perspectives. Our exploration suggests that, although the overall blockchain ecosystem shows promising growth over the last decade, a number of worrying outliers exist that have disrupted its evolution.
Ningyu He, Haoyu Wang, Lei Wu, Xiapu Luo, Yao Guo, Xiangqun Chen
EOSIO, as one of the most representative blockchain 3.0 platforms, involves lots of new features, e.g., delegated proof of stake consensus algorithm and updatable smart contracts, enabling a much higher transaction per second and the prosperous decentralized applications (DApps) ecosystem. According to the statistics, it has reached nearly 18 billion USD, taking the third place of the whole cryptocurrency market, following Bitcoin and Ethereum. Loopholes, however, are hiding in the shadows. EOSBet, a famous gambling DApp, was attacked twice within a month and lost more than 1 million USD. No existing work has surveyed the EOSIO from a security researcher perspective. To fill this gap, in this paper, we collected all occurred attack events against EOSIO, and systematically studied their root causes, i.e., vulnerabilities lurked in all relying components for EOSIO, as well as the corresponding attacks and mitigations. We also summarized some best practices for DApp developers, EOSIO official team, and security researchers for future directions.
Xiaohui Hu, Ningyu He, Haoyu Wang
Serving as the first touch point for users to the cryptocurrency world, cryptocurrency wallets allow users to manage, receive, and transmit digital assets on blockchain networks and interact with emerging decentralized finance (DeFi) applications. Unfortunately, cryptocurrency wallets have always been the prime targets for attackers, and incidents of wallet breaches have been reported from time to time. Although some recent studies have characterized the vulnerabilities and scams related to wallets, they have generally been characterized in coarse granularity, overlooking potential risks inherent in detailed designs of cryptocurrency wallets, especially from perspectives including user interaction and advanced features. To fill the void, in this paper, we present a fine-grained security analysis on browser-based cryptocurrency wallets. To pinpoint security issues of components in wallets, we design WalletProbe, a mutation-based testing framework based on visual-level oracles. We have identified 13 attack vectors that can be abused by attackers to exploit cryptocurrency wallets and exposed 21 concrete attack strategies. By applying WalletProbe on 39 widely-adopted browser-based wallet extensions, we astonishingly figure out all of them can be abused to steal crypto assets from innocent users. Identified potential attack vectors were reported to wallet developers timely and 26 issues have been patched already. It is, hence, urgent for our community to take action to mitigate threats related to cryptocurrency wallets. We promise to release all code and data to promote the development of the community.
Edward Lo, Ningyu He, Yuejie Shi, Jiajia Xu, Chiachih Wu, Ding Li, Yao Guo
Recently, the first feature-rich NTFS implementation, NTFS3, has been upstreamed to Linux. Although ensuring the security of NTFS3 is essential for the future of Linux, it remains unclear, however, whether the most recent version of NTFS for Linux contains 0-day vulnerabilities. To this end, we implemented Papora, the first effective fuzzer for NTFS3. We have identified and reported 3 CVE-assigned 0-day vulnerabilities and 9 severe bugs in NTFS3. Furthermore, we have investigated the underlying causes as well as types of these vulnerabilities and bugs. We have conducted an empirical study on the identified bugs while the results of our study have offered practical insights regarding the security of NTFS3 in Linux.
Ningyu He, Zhehao Zhao, Jikai Wang, Yubin Hu, Shengjian Guo, Haoyu Wang, Guangtai Liang, Ding Li, Xiangqun Chen, Yao Guo
Although existing techniques have proposed automated approaches to alleviate the path explosion problem of symbolic execution, users still need to optimize symbolic execution by applying various searching strategies carefully. As existing approaches mainly support only coarse-grained global searching strategies, they cannot efficiently traverse through complex code structures. In this paper, we propose Eunomia, a symbolic execution technique that allows users to specify local domain knowledge to enable fine-grained search. In Eunomia, we design an expressive DSL, Aes, that lets users precisely pinpoint local searching strategies to different parts of the target program. To further optimize local searching strategies, we design an interval-based algorithm that automatically isolates the context of variables for different local searching strategies, avoiding conflicts between local searching strategies for the same variable. We implement Eunomia as a symbolic execution platform targeting WebAssembly, which enables us to analyze applications written in various languages (like C and Go) but can be compiled into WebAssembly. To the best of our knowledge, Eunomia is the first symbolic execution engine that supports the full features of the WebAssembly runtime. We evaluate Eunomia with a dedicated microbenchmark suite for symbolic execution and six real-world applications. Our evaluation shows that Eunomia accelerates bug detection in real-world applications by up to three orders of magnitude. According to the results of a comprehensive user study, users can significantly improve the efficiency and effectiveness of symbolic execution by writing a simple and intuitive Aes script. Besides verifying six known real-world bugs, Eunomia also detected two new zero-day bugs in a popular open-source project, Collections-C.
Xiaohui Hu, Wun Yu Chan, Yuejie Shi, Qumeng Sun, Wei-Cheng Wang, Chiachih Wu, Haoyu Wang, Ningyu He
Smart contract security is paramount, but identifying intricate business logic vulnerabilities remains a persistent challenge because existing solutions consistently fall short: manual auditing is unscalable, static analysis tools are plagued by false positives, and fuzzers struggle to navigate deep logic states within complex systems. Even emerging AI-based methods suffer from hallucinations, context constraints, and a heavy reliance on expensive, proprietary Large Language Models. In this paper, we introduce Heimdallr, an automated auditing agent designed to overcome these hurdles through four core innovations. By reorganizing code at the function level, Heimdallr minimizes context overhead while preserving essential business logic. It then employs heuristic reasoning to detect complex vulnerabilities and automatically chain functional exploits. Finally, a cascaded verification layer validates these findings to eliminate false positives. Notably, this approach achieves high performance on lightweight, open-source models like GPToss-120B without relying on proprietary systems. Our evaluations demonstrate exceptional performance, as Heimdallr successfully reconstructed 17 out of 20 real-world attacks post June 2025, resulting in total losses of $384M, and uncovered 4 confirmed zero-day vulnerabilities that safeguarded $400M in TVL. Compared to SOTA baselines including both official industrial tools and academic tools, Heimdallr at most reduces analysis time by 97.59% and financial costs by 98.77% while boosting detection precision by over 93.66%. Notably, when applied to auditing contests, Heimdallr can achieve a 92.45% detection rate at a negligible cost of $2.31 per 10K LOC. We provide production-ready auditing services and release valuable benchmarks for future work.
Pengxiang Ma, Ningyu He, Yuhua Huang, Haoyu Wang, Xiapu Luo
Smart contracts play a vital role in the Ethereum ecosystem. Due to the prevalence of kinds of security issues in smart contracts, the smart contract verification is urgently needed, which is the process of matching a smart contract's source code to its on-chain bytecode for gaining mutual trust between smart contract developers and users. Although smart contract verification services are embedded in both popular Ethereum browsers (e.g., Etherscan and Blockscout) and official platforms (i.e., Sourcify), and gain great popularity in the ecosystem, their security and trustworthiness remain unclear. To fill the void, we present the first comprehensive security analysis of smart contract verification services in the wild. By diving into the detailed workflow of existing verifiers, we have summarized the key security properties that should be met, and observed eight types of vulnerabilities that can break the verification. Further, we propose a series of detection and exploitation methods to reveal the presence of vulnerabilities in the most popular services, and uncover 19 exploitable vulnerabilities in total. All the studied smart contract verification services can be abused to help spread malicious smart contracts, and we have already observed the presence of using this kind of tricks for scamming by attackers. It is hence urgent for our community to take actions to detect and mitigate security issues related to smart contract verification, a key component of the Ethereum smart contract ecosystem.
Ningyu He, Lei Wu, Haoyu Wang, Yao Guo, Xuxian Jiang
In this paper, we present the first large-scale and systematic study to characterize the code reuse practice in the Ethereum smart contract ecosystem. We first performed a detailed similarity comparison study on a dataset of 10 million contracts we had harvested, and then we further conducted a qualitative analysis to characterize the diversity of the ecosystem, understand the correlation between code reuse and vulnerabilities, and detect the plagiarist DApps. Our analysis revealed that over 96% of the contracts had duplicates, while a large number of them were similar, which suggests that the ecosystem is highly homogeneous. Our results also suggested that roughly 9.7% of the similar contract pairs have exactly the same vulnerabilities, which we assume were introduced by code clones. In addition, we identified 41 DApps clusters, involving 73 plagiarized DApps which had caused huge financial loss to the original creators, accounting for 1/3 of the original market volume.
Ningyu He, Ruiyi Zhang, Lei Wu, Haoyu Wang, Xiapu Luo, Yao Guo, Ting Yu, Xuxian Jiang
The EOSIO blockchain, one of the representative Delegated Proof-of-Stake (DPoS) blockchain platforms, has grown rapidly recently. Meanwhile, a number of vulnerabilities and high-profile attacks against top EOSIO DApps and their smart contracts have also been discovered and observed in the wild, resulting in serious financial damages. Most of EOSIO's smart contracts are not open-sourced and they are typically compiled to WebAssembly (Wasm) bytecode, thus making it challenging to analyze and detect the presence of possible vulnerabilities. In this paper, we propose EOSAFE, the first static analysis framework that can be used to automatically detect vulnerabilities in EOSIO smart contracts at the bytecode level. Our framework includes a practical symbolic execution engine for Wasm, a customized library emulator for EOSIO smart contracts, and four heuristics-driven detectors to identify the presence of four most popular vulnerabilities in EOSIO smart contracts. Experiment results suggest that EOSAFE achieves promising results in detecting vulnerabilities, with an F1-measure of 98%. We have applied EOSAFE to all active 53,666 smart contracts in the ecosystem (as of November 15, 2019). Our results show that over 25% of the smart contracts are vulnerable. We further analyze possible exploitation attempts against these vulnerable smart contracts and identify 48 in-the-wild attacks (25 of them have been confirmed by DApp developers), resulting in financial loss of at least 1.7 million USD.
Ningyu He, Zhehao Zhao, Hanqin Guan, Jikai Wang, Shuo Peng, Ding Li, Haoyu Wang, Xiangqun Chen, Yao Guo
WebAssembly (Wasm), as a compact, fast, and isolation-guaranteed binary format, can be compiled from more than 40 high-level programming languages. However, vulnerabilities in Wasm binaries could lead to sensitive data leakage and even threaten their hosting environments. To identify them, symbolic execution is widely adopted due to its soundness and the ability to automatically generate exploitations. However, existing symbolic executors for Wasm binaries are typically platform-specific, which means that they cannot support all Wasm features. They may also require significant manual interventions to complete the analysis and suffer from efficiency issues as well. In this paper, we propose an efficient and fully-functional symbolic execution engine, named SeeWasm. Compared with existing tools, we demonstrate that SeeWasm supports full-featured Wasm binaries without further manual intervention, while accelerating the analysis by 2 to 6 times. SeeWasm has been adopted by existing works to identify more than 30 0-day vulnerabilities or security issues in well-known C, Go, and SGX applications after compiling them to Wasm binaries.
Ningyu He, Shangtong Cao, Haoyu Wang, Yao Guo, Xiapu Luo
As JavaScript has been criticized for performance and security issues in web applications, WebAssembly (Wasm) was proposed in 2017 and is regarded as the complementation for JavaScript. Due to its advantages like compact-size, native-like speed, and portability, Wasm binaries are gradually used as the compilation target for industrial projects in other high-level programming languages and are responsible for computation-intensive tasks in browsers, e.g., 3D graphic rendering and video decoding. Intuitively, characterizing in-the-wild adopted Wasm binaries from different perspectives, like their metadata, relation with source programming language, existence of security threats, and practical purpose, is the prerequisite before delving deeper into the Wasm ecosystem and beneficial to its roadmap selection. However, currently, there is no work that conducts a large-scale measurement study on in-the-wild adopted Wasm binaries. To fill this gap, we collect the largest-ever dataset to the best of our knowledge, and characterize the status quo of them from industry perspectives. According to the different roles of people engaging in the community, i.e., web developers, Wasm maintainers, and researchers, we reorganized our findings to suggestions and best practices for them accordingly. We believe this work can shed light on the future direction of the web and Wasm.
Jintao Huang, Ningyu He, Kai Ma, Jiang Xiao, Haoyu Wang
NFT rug pull is one of the most prominent type of scam that the developers of a project abandon it and then run away with investors' funds. Although they have drawn attention from our community, to the best of our knowledge, the NFT rug pulls have not been systematically explored. To fill the void, this paper presents the first in-depth study of NFT rug pulls. Specifically, we first compile a list of 253 known NFT rug pulls as our initial ground truth, based on which we perform a pilot study, highlighting the key symptoms of NFT rug pulls. Then, we enforce a strict rule-based method to flag more rug pulled NFT projects in the wild, and have labelled 7,487 NFT rug pulls as our extended ground truth. Atop it, we have investigated the art of NFT rug pulls, with kinds of tricks including explicit ones that are embedded with backdoors, and implicit ones that manipulate the market. To release the expansion of the scam, we further design a prediction model to proactively identify the potential rug pull projects in an early stage ahead of the scam happens. We have implemented a prototype system deployed in the real-world setting for over 5 months. Our system has raised alarms for 7,821 NFT projects, by the time of this writing, which can work as a whistle blower that pinpoints rug pull scams timely, thus mitigating the impacts.
Ru Ji, Ningyu He, Lei Wu, Haoyu Wang, Guangdong Bai, Yao Guo
Cryptocurrency has seen an explosive growth in recent years, thanks to the evolvement of blockchain technology and its economic ecosystem. Besides Bitcoin, thousands of cryptocurrencies have been distributed on blockchains, while hundreds of cryptocurrency exchanges are emerging to facilitate the trading of digital assets. At the same time, it also attracts the attentions of attackers. Fake deposit, as one of the most representative attacks (vulnerabilities) related to exchanges and tokens, has been frequently observed in the blockchain ecosystem, causing large financial losses. However, besides a few security reports, our community lacks of the understanding of this vulnerability, for example its scale and the impacts. In this paper, we take the first step to demystify the fake deposit vulnerability. Based on the essential patterns we have summarized, we implement DEPOSafe, an automated tool to detect and verify (exploit) the fake deposit vulnerability in ERC-20 smart contracts. DEPOSafe incorporates several key techniques including symbolic execution based static analysis and behavior modeling based dynamic verification. By applying DEPOSafe to 176,000 ERC-20 smart contracts, we have identified over 7,000 vulnerable contracts that may suffer from two types of attacks. Our findings demonstrate the urgency to identify and prevent the fake deposit vulnerability.
Tianle Sun, Ningyu He, Jiang Xiao, Yinliang Yue, Xiapu Luo, Haoyu Wang
In Ethereum, the practice of verifying the validity of the passed addresses is a common practice, which is a crucial step to ensure the secure execution of smart contracts. Vulnerabilities in the process of address verification can lead to great security issues, and anecdotal evidence has been reported by our community. However, this type of vulnerability has not been well studied. To fill the void, in this paper, we aim to characterize and detect this kind of emerging vulnerability. We design and implement AVVERIFIER, a lightweight taint analyzer based on static EVM opcode simulation. Its three-phase detector can progressively rule out false positives and false negatives based on the intrinsic characteristics. Upon a well-established and unbiased benchmark, AVVERIFIER can improve efficiency 2 to 5 times than the SOTA while maintaining a 94.3% precision and 100% recall. After a large-scale evaluation of over 5 million Ethereum smart contracts, we have identified 812 vulnerable smart contracts that were undisclosed by our community before this work, and 348 open source smart contracts were further verified, whose largest total value locked is over $11.2 billion. We further deploy AVVERIFIER as a real-time detector on Ethereum and Binance Smart Chain, and the results suggest that AVVERIFIER can raise timely warnings once contracts are deployed.
Shangtong Cao, Ningyu He, Yao Guo, Haoyu Wang
WebAssembly (Wasm) is an emerging binary format that draws great attention from our community. However, Wasm binaries are weakly protected, as they can be read, edited, and manipulated by adversaries using either the officially provided readable text format (i.e., wat) or some advanced binary analysis tools. Reverse engineering of Wasm binaries is often used for nefarious intentions, e.g., identifying and exploiting both classic vulnerabilities and Wasm specific vulnerabilities exposed in the binaries. However, no Wasm-specific obfuscator is available in our community to secure the Wasm binaries. To fill the gap, in this paper, we present WASMixer, the first general-purpose Wasm binary obfuscator, enforcing data-level (string literals and function names) and code-level (control flow and instructions) obfuscation for Wasm binaries. We propose a series of key techniques to overcome challenges during Wasm binary rewriting, including an on-demand decryption method to minimize the impact brought by decrypting the data in memory area, and code splitting/reconstructing algorithms to handle structured control flow in Wasm. Extensive experiments demonstrate the correctness, effectiveness and efficiency of WASMixer. Our research has shed light on the promising direction of Wasm binary research, including Wasm code protection, Wasm binary diversification, and the attack-defense arm race of Wasm binaries.
Pengcheng Xia, Yanhui Guo, Zhaowen Lin, Jun Wu, Pengbo Duan, Ningyu He, Kailong Wang, Tianming Liu, Yinliang Yue, Guoai Xu, Haoyu Wang
Cryptocurrency wallets, acting as fundamental infrastructure to the blockchain ecosystem, have seen significant user growth, particularly among browser-based wallets (i.e., browser extensions). However, this expansion accompanies security challenges, making these wallets prime targets for malicious activities. Despite a substantial user base, there is not only a significant gap in comprehensive security analysis but also a pressing need for specialized tools that can aid developers in reducing vulnerabilities during the development process. To fill the void, we present a comprehensive security analysis of browser-based wallets in this paper, along with the development of an automated tool designed for this purpose. We first compile a taxonomy of security vulnerabilities resident in cryptocurrency wallets by harvesting historical security reports. Based on this, we design WALLETRADAR, an automated detection framework that can accurately identify security issues based on static and dynamic analysis. Evaluation of 96 popular browser-based wallets shows WALLETRADAR's effectiveness, by successfully automating the detection process in 90% of these wallets with high precision. This evaluation has led to the discovery of 116 security vulnerabilities corresponding to 70 wallets. By the time of this paper, we have received confirmations of 10 vulnerabilities from 8 wallet developers, with over $2,000 bug bounties. Further, we observed that 12 wallet developers have silently fixed 16 vulnerabilities after our disclosure. WALLETRADAR can effectively automate the identification of security risks in cryptocurrency wallets, thereby enhancing software development quality and safety in the blockchain ecosystem.
Bosi Zhang, Ningyu He, Xiaohui Hu, Kai Ma, Haoyu Wang
Price manipulation attack is one of the notorious threats in decentralized finance (DeFi) applications, which allows attackers to exchange tokens at an extensively deviated price from the market. Existing efforts usually rely on reactive methods to identify such kind of attacks after they have happened, e.g., detecting attack transactions in the post-attack stage, which cannot mitigate or prevent price manipulation attacks timely. From the perspective of attackers, they usually need to deploy attack contracts in the pre-attack stage. Thus, if we can identify these attack contracts in a proactive manner, we can raise alarms and mitigate the threats. With the core idea in mind, in this work, we shift our attention from the victims to the attackers. Specifically, we propose SMARTCAT, a novel approach for identifying price manipulation attacks in the pre-attack stage proactively. For generality, it conducts analysis on bytecode and does not require any source code and transaction data. For accuracy, it depicts the control- and data-flow dependency relationships among function calls into a token flow graph. For scalability, it filters out those suspicious paths, in which it conducts inter-contract analysis as necessary. To this end, SMARTCAT can pinpoint attacks in real time once they have been deployed on a chain. The evaluation results illustrate that SMARTCAT significantly outperforms existing baselines with 91.6% recall and ~100% precision. Moreover, SMARTCAT also uncovers 616 attack contracts in-the-wild, accounting for \$9.25M financial losses, with only 19 cases publicly reported. By applying SMARTCAT as a real-time detector in Ethereum and Binance Smart Chain, it has raised 14 alarms 99 seconds after the corresponding deployment on average. These attacks have already led to $641K financial losses, and seven of them are still waiting for their ripe time.
Shangtong Cao, Ningyu He, Xinyu She, Yixuan Zhang, Mu Zhang, Haoyu Wang
Wasm runtime is a fundamental component in the Wasm ecosystem, as it directly impacts whether Wasm applications can be executed as expected. Bugs in Wasm runtime bugs are frequently reported, thus our research community has made a few attempts to design automated testing frameworks for detecting bugs in Wasm runtimes. However, existing testing frameworks are limited by the quality of test cases, i.e., they face challenges of generating both semantic-rich and syntactic-correct Wasm binaries, thus complicated bugs cannot be triggered. In this work, we present WRTester, a novel differential testing framework that can generated complicated Wasm test cases by disassembling and assembling of real-world Wasm binaries, which can trigger hidden inconsistencies among Wasm runtimes. For further pinpointing the root causes of unexpected behaviors, we design a runtime-agnostic root cause location method to accurately locate bugs. Extensive evaluation suggests that WRTester outperforms SOTA techniques in terms of both efficiency and effectiveness. We have uncovered 33 unique bugs in popular Wasm runtimes, among which 25 have been confirmed.