Date          Name
Jun 1, 2020   Model-Based Reinforcement Learning with Value-Targeted Regression
Mar 8, 2024   Switching the Loss Reduces the Cost in Batch (Offline) Reinforcement Learning
Oct 27, 2025  Learning to Reason Efficiently with Discounted Reinforcement Learning
Oct 1, 2025   Rectifying Regression in Reinforcement Learning
Oct 1, 2024   Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits
Oct 12, 2025  Does Weighting Improve Matrix Factorization for Recommender Systems?
Jan 14, 2026  Eluder dimension: localise it!
Aug 5, 2021   An Elementary Proof that Q-learning Converges Almost Surely
Jun 15, 2021  Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Jan 29, 2026  Efficient Simple Regret Algorithms for Stochastic Contextual Bandits
Jul 28, 2025  Bernstein-type dimension-free concentration for self-normalised martingales
Nov 13, 2023  Exploration via linearly perturbed loss minimisation
Dec 17, 2022  Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
Apr 23, 2026  Revisiting Subgradient Dominance in Robust MDPs: Counterexamples, Hardness, and Sufficient Conditions