"au:"Ziniu Li"" — arXiv2 Search


Showing 1–8 of 8 results

Date | Name

Mar 24, 2026 | Off-Policy Value-Based Reinforcement Learning for Large Language Models
Jan 9, 2026 | The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
Oct 31, 2025 | ORGEval: Graph-Theoretic Evaluation of LLMs in Optimization Modeling
Sep 30, 2025 | Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
May 16, 2025 | Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO
Aug 29, 2024 | Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Jun 24, 2024 | Adam-mini: Use Fewer Learning Rates To Gain More
Feb 26, 2024 | Why Transformers Need Adam: A Hessian Perspective