arXiv2
Search results for au:"Chuyi He" (arXiv2 Search)
Showing 1–4 of 4 results
Oct 19, 2024
On Designing Effective RL Reward at Training Time for LLM Reasoning
Aug 11, 2025
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL
May 30, 2025
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Jan 30, 2026
From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents