Daniel Luo, Alexander Wolitzky
We study reputation formation where a long-run player repeatedly observes private signals and takes actions. Short-run players observe the long-run player's past actions but not her past signals. The long-run player can thus develop a reputation for playing a distribution over actions, but not necessarily for playing a particular mapping from signals to actions. Nonetheless, we show that the long-run player can secure her Stackelberg payoff if distinct commitment types are statistically distinguishable and the Stackelberg strategy is confound-defeating. This property holds if and only if the Stackelberg strategy is the unique solution to an optimal transport problem. If the long-run player's payoff is supermodular in one-dimensional signals and actions, she secures the Stackelberg payoff if and only if the Stackelberg strategy is monotone. Applications include deterrence, delegation, signaling, and persuasion. Our results extend to the case where distinct commitment types may be indistinguishable but the Stackelberg type is salient under the prior.
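The optimal transport condition in the abstract above can be illustrated with a small, hypothetical example. The sketch below assumes three equally likely signals, three equally likely actions, and the supermodular payoff `s * a`; none of these values come from the paper. It brute-forces every one-to-one assignment of signals to actions and reports which assignments maximize total payoff. Under these assumptions the monotone (increasing) assignment is the only maximizer.

```python
from itertools import permutations

def best_assignments(signals, actions, payoff, tol=1e-12):
    """Brute-force a discrete optimal transport problem with uniform marginals.

    Each permutation maps signal i to actions[perm[i]]. Returns the maximal
    total payoff and every permutation that attains it.
    """
    best_value, best = None, []
    for perm in permutations(range(len(actions))):
        value = sum(payoff(s, actions[j]) for s, j in zip(signals, perm))
        if best_value is None or value > best_value + tol:
            best_value, best = value, [perm]
        elif abs(value - best_value) <= tol:
            best.append(perm)
    return best_value, best

# Hypothetical primitives (assumptions, not taken from the paper).
signals = [0.0, 1.0, 2.0]
actions = [0.0, 1.0, 3.0]
payoff = lambda s, a: s * a  # strictly supermodular in (s, a)

value, optimal = best_assignments(signals, actions, payoff)
print(value, optimal)
```

Here `optimal` contains only the identity permutation, so the increasing map from signals to actions is the unique solution of this toy transport problem. Replacing the payoff with a submodular one such as `-s * a` would instead select the decreasing assignment.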
Daniel Luo
I study reputation formation in repeated games where player actions endogenously determine the probability the game permanently ends. Permanent exit can render reputation useless even to a patient long-lived player whose actions are perfectly monitored, in stark contrast to canonical commitment payoff theorems. However, I identify tight conditions for the long-run player to attain their Stackelberg payoff in all Markov equilibria. Along the way, I highlight the role of Markov strategies in pinning down the value of reputation formation. I apply my results to give conditional commitment foundations for the infinite chain-store game. I also analyze repeated global games with exit, and obtain new predictions about regime survival.
Daniel Luo
What is the optimal order in which a researcher should submit their papers to journals of differing quality? I analyze a sequential search model without recall where the researcher's expected value from journal submission depends on the history of past submissions. Acceptances immediately terminate the search process and deliver some payoff, while rejections carry information about the paper's quality, affecting the researcher's beliefs about acceptance probabilities at future journals. When journal feedback does not change the paper's quality, the researcher's optimal strategy is monotone in their acceptance payoff. Submission costs distort the researcher's effective acceptance payoff, but preserve the optimality of monotone strategies. If journals give feedback which can affect the paper's quality, such as through referee reports, the search order can change drastically depending on the researcher's prior belief about their paper's quality. However, I identify a set of assortative matched conditions on feedback such that monotone strategies remain optimal whenever the researcher's prior is sufficiently optimistic.
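The monotonicity claim in the abstract above can be checked by hand in a deliberately simplified setting. The sketch below assumes fixed, independent acceptance probabilities (so rejections carry no information), no submission costs, and made-up numbers for three journals. It is not the paper's model, which allows beliefs to depend on the submission history. Under these assumptions, swapping two adjacent journals `i` and `j` changes the expected payoff in proportion to `p_i * p_j * (v_i - v_j)`, so submitting in decreasing order of acceptance payoff is optimal.

```python
from itertools import permutations

def expected_payoff(order, accept_prob, payoff):
    """Expected payoff of submitting to journals in `order` without recall.

    The search stops at the first acceptance; a rejection moves the
    researcher on to the next journal in the order.
    """
    total, still_searching = 0.0, 1.0
    for j in order:
        total += still_searching * accept_prob[j] * payoff[j]
        still_searching *= 1.0 - accept_prob[j]
    return total

def best_order(accept_prob, payoff):
    """Brute-force the submission order with the highest expected payoff."""
    journals = range(len(payoff))
    return max(permutations(journals),
               key=lambda o: expected_payoff(o, accept_prob, payoff))

# Hypothetical journals (assumed numbers): index 0 is the most selective.
accept_prob = [0.1, 0.3, 0.6]
payoff = [10.0, 4.0, 1.0]

order = best_order(accept_prob, payoff)
print(order, expected_payoff(order, accept_prob, payoff))
```

With these numbers the best order is `(0, 1, 2)`, that is, from the highest acceptance payoff to the lowest, regardless of the acceptance probabilities.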
Eric Gao, Daniel Luo
We study economies where consumers interact independently with many monopolists. When consumer valuations over goods are correlated, correlation can distort the induced distribution of consumer surplus (information rents). We identify which shifts in the correlation structure over values make the induced distribution more or less fair, in the sense of second-order stochastic dominance. We then investigate the role taxation can have on information rents, and show the tax authority never benefits from randomizing the allocation of goods. We characterize the set of mechanisms that are on the fairness-efficiency frontier under regularity conditions on the distribution of types. Furthermore, under these conditions all allocations on the fairness-efficiency frontier ration the good more than an unregulated monopolist. Finally, we discuss implications of our model for luxury commodity taxation.
Daniel Luo
I study dynamic contracting where Sender privately observes a Markovian state and seeks to motivate Receiver, who acts. Sender provides incentives in two ways: payments, which alter payoffs ex-post, and (Bayesian) persuasion, which shapes Receiver's interim beliefs about payoffs. For all stage game payoffs, discount rates, and Markov transition rules, transfers are a last resort: there is an optimal contract where payments occur only after Sender commits to reveal the state at every continuation history. In an example, the optimal contract is a loyalty program: Sender chooses the static optimal information structure until a random promotion time, after which Sender reveals the state and pays Receiver.
Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre
Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms. In modern RL research, there is a need for simulated environments that are performant, scalable, and modular to enable their utilization in a wider range of potential real-world applications. Therefore, we present Jumanji, a suite of diverse RL environments specifically designed to be fast, flexible, and scalable. Jumanji provides a suite of environments focusing on combinatorial problems frequently encountered in industry, as well as challenging general decision-making tasks. By leveraging the efficiency of JAX and hardware accelerators like GPUs and TPUs, Jumanji enables rapid iteration of research ideas and large-scale experimentation, ultimately empowering more capable agents. Unlike existing RL environment suites, Jumanji is highly customizable, allowing users to tailor the initial state distribution and problem complexity to their needs. Furthermore, we provide actor-critic baselines for each environment, accompanied by preliminary findings on scaling and generalization scenarios. Jumanji aims to set a new standard for speed, adaptability, and scalability of RL environments.
Eric Gao, Daniel Luo
We study the design of information acquisition games (environments where a designer contracts their action on Sender's choice of experiment and the realized signals about some state) and identify which predictions can be made absent knowledge about the prior. To do so, we characterize robust mechanisms: those which induce the same allocation rule (a mapping from states to actions) for all priors. These mechanisms take a simple form: they (1) incentivize fully revealing experiments, (2) depend only on the induced posterior, and (3) maximally punish pooling deviations. In binary action problems, all (and only) ordinally monotone allocation rules are robust. We apply our model to school choice and uncover a novel informational justification for deferred acceptance when school preferences depend on students' unknown ability. For general good allocation problems, we show all efficient allocations are robust, even when agent preferences feature state-dependent outside options and allocation externalities.
Elliott Wen, Sean Ma, Ewan Tempero, Jens Dietrich, Daniel Luo, Jiaxing Shen, Kaiqi Zhao, Bruce Sham, Yousong Song, Jiayi Hua, Jia Hong
While NVIDIA remains the dominant provider of AI accelerators in cloud data centers, emerging vendors such as AMD, Intel, Mac, and Huawei offer cost-effective alternatives with claims of compatibility and performance. This paper presents the first empirical study investigating divergence in machine learning model behavior across heterogeneous AI accelerators. Utilizing an automated pipeline, we synthesize over 100,000 variant models derived from 4,000 real-world models and execute them across five different enterprise-grade accelerators. Our findings suggest that newer AI platforms from Mac and Huawei support at least 17% fewer operators than NVIDIA. These platforms also exhibit a higher rate of output discrepancies (exceeding 5%), which stem from differences in operator implementations, handling of exceptional numerical values, and instruction scheduling. They are also more susceptible to failures during model compilation-based acceleration, and in some cases, the compiled models produce outputs that differ noticeably from those generated using the standard execution mode. In addition, we identify 7 implementation flaws in PyTorch and 40 platform-specific issues across vendors. These results underscore the challenges of achieving consistent machine learning behavior in an increasingly diverse hardware ecosystem.
Jiachi Chen, Xin Xia, David Lo, John Grundy, Daniel Xiapu Luo, Ting Chen
Smart contracts are programs running on a blockchain. They are immutable, and hence cannot be patched for bugs once deployed. Thus it is critical to ensure they are bug-free and well-designed before deployment. A contract defect is an error, flaw, or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. Detecting contract defects is a way to avoid potential bugs and improve the design of existing code. Since smart contracts have numerous distinctive features, such as the gas system and decentralization, it is important to identify defects that are specific to smart contracts. To fill this gap, we collected smart-contract-related posts from Ethereum StackExchange, as well as real-world smart contracts. We manually analyzed these posts and contracts, using them to define 20 kinds of contract defects. We categorized the defects as indicating potential security, availability, performance, maintainability, and reusability problems. To validate whether practitioners consider these contract defects harmful, we created an online survey and received 138 responses from 32 different countries. Feedback showed that these contract defects are harmful and that removing them would improve the quality and robustness of smart contracts. We manually identified our defined contract defects in 587 real-world smart contracts and publicly released our dataset. Finally, we summarized 5 impacts caused by contract defects. These help developers better understand the symptoms of the defects and prioritize their removal.
Zeliang Kan, Haoyu Wang, Lei Wu, Yao Guo, Daniel Xiapu Luo
With the popularity of Android apps, different techniques have been proposed to enhance app protection. As an effective approach to prevent reverse engineering, obfuscation can be used to serve both benign and malicious purposes. In recent years, more and more sensitive logic or data have been implemented as obfuscated native code because of the limitations of Java bytecode. As a result, native code obfuscation has become a major obstacle for security analysts trying to understand the complicated logic. In this paper, we propose DiANa, an automated system to facilitate the deobfuscation of native binary code in Android apps. Specifically, given a binary obfuscated by Obfuscator-LLVM (the most popular native code obfuscator), DiANa is capable of recovering the original Control Flow Graph. To the best of our knowledge, DiANa is the first system that aims to tackle the problem of Android native binary deobfuscation. We have applied DiANa in different scenarios, and the experimental results demonstrate the effectiveness of DiANa based on generic similarity comparison metrics.