"au:"Yuheng Zhang"" — arXiv2 Search


Showing 1–20 of 109 results


Feb 17, 2023 - Practical Contextual Bandits with Feedback Graphs
Jun 28, 2024 - LCSim: A Large-Scale Controllable Traffic Simulator
Jun 30, 2024 - Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Nov 17, 2019 - The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks
Nov 24, 2023 - FRAD: Front-Running Attacks Detection on Ethereum using Ternary Classification Model
Mar 3, 2025 - Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
Oct 4, 2022 - Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs
Mar 17, 2022 - Submillimetre galaxies in two massive protoclusters at z = 2.24: witnessing the enrichment of extreme starbursts in the outskirts of HAE density peaks
Sep 11, 2020 - Improving Robustness to Model Inversion Attacks via Mutual Information Regularization
Feb 11, 2024 - Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
May 31, 2024 - Provably Efficient Interactive-Grounded Learning with Personalized Reward
Apr 8, 2026 - Beyond Pessimism: Offline Learning in KL-regularized Games
Aug 7, 2020 - Convolutional Ordinal Regression Forest for Image Ordinal Estimation
Sep 30, 2025 - Improving Sampling Efficiency in RLVR through Adaptive Rollout and Response Reuse
Feb 24, 2025 - Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Aug 30, 2025 - Near-Duplicate Text Alignment under Weighted Jaccard Similarity
Feb 27, 2026 - Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies
Oct 2, 2022 - Improved Algorithms for Neural Active Learning
Feb 12, 2024 - Efficient Contextual Bandits with Uninformed Feedback Graphs
Feb 6, 2023 - Offline Learning in Markov Games with General Function Approximation