Date / Name
May 25, 2021 / From Motor Control to Team Play in Simulated Humanoid Football
Jan 2, 2018 / DeepMind Control Suite
Jul 9, 2025 / Value from Observations: Towards Large-Scale Imitation Learning via Self-Improvement
Oct 12, 2020 / Local Search for Policy Iteration in Continuous Control
May 15, 2020 / A Distributional View on Multi-Objective Policy Optimization
Oct 9, 2019 / Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Sep 26, 2019 / V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
Jan 2, 2020 / Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics
Jul 30, 2020 / Data-efficient Hindsight Off-policy Option Learning
Feb 12, 2019 / Value constrained model-free continuous control
Jun 29, 2016 / Model-Free Trajectory-based Policy Optimization with Monotonic Improvement
May 22, 2017 / Guide Actor-Critic for Continuous Control
Feb 8, 2024 / Offline Actor-Critic Reinforcement Learning Scales to Large Models
Feb 24, 2023 / Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains
Oct 2, 2025 / Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer
Jun 20, 2023 / RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation
Aug 29, 2023 / Policy composition in reinforcement learning via multi-objective policy optimization
Apr 21, 2022 / Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Nov 5, 2019 / Quinoa: a Q-function You Infer Normalized Over Actions
Feb 13, 2019 / Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup