Showing 1–20 of 51 results
Date / Name
May 26, 2018 / Fast Policy Learning through Imitation and Reinforcement
Nov 16, 2018 / RMPflow: A Computational Graph for Automatic Motion Policy Generation
Feb 5, 2022 / Adversarially Trained Actor Critic for Offline Reinforcement Learning
Mar 15, 2023 / PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining
Jun 12, 2018 / Accelerating Imitation Learning with Predictive Models
Mar 29, 2019 / Stable, Concurrent Controller Composition for Multi-Objective Robotic Tasks
Nov 14, 2019 / A Reduction from Reinforcement Learning to No-Regret Online Learning
Mar 15, 2020 / Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks
Jul 13, 2022 / Hindsight Learning for MDPs with Exogenous Inputs
Jul 6, 2020 / Explaining Fast Improvement in Online Imitation Learning
Feb 19, 2019 / Online Learning with Continuous Variations: Dynamic Regret and Reductions
Jun 13, 2021 / Bellman-consistent Pessimism for Offline Reinforcement Learning
Oct 15, 2018 / Predictor-Corrector Policy Optimization
Oct 25, 2018 / Truncated Back-propagation for Bilevel Optimization
Nov 8, 2022 / ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data
Apr 4, 2024 / Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Feb 16, 2024 / PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
Jun 5, 2021 / Heuristic-Guided Reinforcement Learning
Mar 16, 2026 / POLCA: Stochastic Generative Optimization with LLM
Jan 6, 2023 / Provable Reset-free Reinforcement Learning by No-Regret Reduction