Showing 1–20 of 109 results
Date / Name
Feb 17, 2023 / Practical Contextual Bandits with Feedback Graphs
Jun 28, 2024 / LCSim: A Large-Scale Controllable Traffic Simulator
Jun 30, 2024 / Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Nov 17, 2019 / The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks
Nov 24, 2023 / FRAD: Front-Running Attacks Detection on Ethereum using Ternary Classification Model
Mar 3, 2025 / Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
Oct 4, 2022 / Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs
Mar 17, 2022 / Submillimetre galaxies in two massive protoclusters at z = 2.24: witnessing the enrichment of extreme starbursts in the outskirts of HAE density peaks
Sep 11, 2020 / Improving Robustness to Model Inversion Attacks via Mutual Information Regularization
Feb 11, 2024 / Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
May 31, 2024 / Provably Efficient Interactive-Grounded Learning with Personalized Reward
Apr 8, 2026 / Beyond Pessimism: Offline Learning in KL-regularized Games
Aug 7, 2020 / Convolutional Ordinal Regression Forest for Image Ordinal Estimation
Sep 30, 2025 / Improving Sampling Efficiency in RLVR through Adaptive Rollout and Response Reuse
Feb 24, 2025 / Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Aug 30, 2025 / Near-Duplicate Text Alignment under Weighted Jaccard Similarity
Feb 27, 2026 / Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies
Oct 2, 2022 / Improved Algorithms for Neural Active Learning
Feb 12, 2024 / Efficient Contextual Bandits with Uninformed Feedback Graphs
Feb 6, 2023 / Offline Learning in Markov Games with General Function Approximation