Date | Name
Oct 15, 2019 | Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Oct 25, 2020 | XLVIN: eXecuted Latent Value Iteration Nets
Mar 10, 2021 | An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
Sep 16, 2016 | The Option-Critic Architecture
Jul 6, 2020 | TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?
Feb 26, 2020 | Policy Evaluation Networks
Jan 1, 2020 | Options of Interest: Temporal Abstraction with Interest Functions
Jun 6, 2021 | Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation
Dec 22, 2021 | Direct Behavior Specification via Constrained Reinforcement Learning
Sep 26, 2020 | Graph neural induction of value iteration
Oct 11, 2021 | Neural Algorithmic Reasoners are Implicit Planners
Dec 3, 2016 | A Matrix Splitting Perspective on Planning with Options
Sep 14, 2017 | When Waiting is not an Option: Learning Options with a Deliberation Cost
Oct 23, 2023 | Course Correcting Koopman Representations
Jun 7, 2023 | Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design
Dec 11, 2024 | MaestroMotif: Skill Design from Artificial Intelligence Feedback
Feb 8, 2025 | Mol-MoE: Training Preference-Guided Routers for Molecule Generation
Jun 8, 2025 | State Entropy Regularization for Robust Reinforcement Learning
Oct 1, 2025 | The Three Regimes of Offline-to-Online Reinforcement Learning
May 16, 2022 | The Primacy Bias in Deep Reinforcement Learning