Date / Name
Mar 19, 2024 / Yell At Your Robot: Improving On-the-Fly from Language Corrections
Oct 23, 2023 / Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning
Apr 22, 2024 / Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Jul 7, 2025 / Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Oct 12, 2023 / Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias
Oct 19, 2022 / When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning
Sep 15, 2024 / Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Oct 30, 2024 / Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
Oct 19, 2023 / An Emulator for Fine-Tuning Large Language Models using Small Language Models
Apr 1, 2024 / Stream of Search (SoS): Learning to Search in Language
Dec 9, 2024 / Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Oct 13, 2023 / Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Feb 16, 2024 / RLVF: Learning from Verbal Feedback without Overgeneralization
Jun 2, 2021 / Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning