Showing 1–20 of 35 results
Date / Name
Jan 3, 2024 / Optimal cross-learning for contextual bandits with unknown context distributions
Nov 11, 2024 / Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback
Oct 6, 2021 / Efficient Methods for Online Multiclass Logistic Regression
Jul 4, 2018 / Factored Bandits
Oct 14, 2019 / An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays
Aug 23, 2022 / A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning
Feb 6, 2022 / Pushing the Efficiency-Regret Pareto Frontier for Online Learning of Portfolios and Quantum States
May 28, 2019 / Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
Oct 7, 2021 / A Model Selection Approach for Corruption Robust Reinforcement Learning
Jun 3, 2025 / Non-stationary Bandit Convex Optimization: A Comprehensive Study
Oct 17, 2022 / A Unified Algorithm for Stochastic Path Problems
Feb 20, 2023 / A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Feb 18, 2023 / Best of Both Worlds Policy Optimization
Jul 19, 2018 / Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
Jul 12, 2021 / Adapting to Misspecification in Contextual Bandits
Feb 4, 2025 / A Scalable Crawling Algorithm Utilizing Noisy Change-Indicating Signals
Dec 10, 2025 / Contextual Dynamic Pricing with Heterogeneous Buyers
Jan 25, 2019 / Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
Oct 25, 2021 / The Pareto Frontier of model selection for general Contextual Bandits
May 10, 2024 / Incentive-compatible Bandits: Importance Weighting No More