"au:"Archit Sharma"" — arXiv2 Search

Showing 1–20 of 34 results

/ Date/ Name

May 29, 2023 Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Jul 2, 2019 Dynamics-Aware Unsupervised Discovery of Skills
Apr 27, 2020 Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
May 11, 2022 A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning
Mar 2, 2023 Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning
Feb 19, 2024 A Critical Evaluation of AI Feedback for Aligning Large Language Models
Feb 10, 2026 Instruct2Act: From Human Instruction to Actions Sequencing and Execution via Robot Action Network for Robotic Manipulation
Jul 27, 2021 Autonomous Reinforcement Learning via Subgoal Curricula
Feb 10, 2026 RoboSubtaskNet: Temporal Sub-task Segmentation for Human-to-Robot Skill Transfer in Real-World Environments
May 3, 2018 TrueChain: Highly Performant Decentralized Public Ledger
Dec 17, 2021 Autonomous Reinforcement Learning: Formalism and Benchmarking
Mar 24, 2021 Discriminator Augmented Model-Based Reinforcement Learning
Mar 19, 2024 DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Jan 29, 2024 SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
Nov 2, 2023 Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment
Feb 26, 2025 FSPO: Few-Shot Optimization of Synthetic Preferences Personalizes to Real Users
Dec 11, 2024 Test-Time Alignment via Hypothesis Reweighting
Oct 17, 2022 You Only Live Once: Single-Life Reinforcement Learning
Jul 26, 2023 Waypoint-Based Imitation Learning for Robotic Manipulation
May 24, 2023 Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback