Date | Name
Oct 17, 2023 | RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms
Aug 6, 2025 | CARD: A Cache-Assisted Parallel Speculative Decoding Framework via Query-and-Correct Paradigm for Accelerating LLM Inference
Oct 13, 2024 | RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Feb 4, 2026 | Steering LLMs via Scalable Interactive Oversight
Mar 21, 2022 | Global Matching with Overlapping Attention for Optical Flow Estimation
Feb 2, 2024 | StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
Oct 21, 2025 | BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
Aug 5, 2025 | VRPO: Rethinking Value Modeling for Robust RL Training under Noisy Supervision
Apr 15, 2026 | MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning
Dec 15, 2023 | LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
Jan 26, 2021 | Semi-synthesis: A fast way to produce effective datasets for stereo matching
May 1, 2024 | MetaRM: Shifted Distributions Alignment via Meta-Learning
Jun 26, 2024 | SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance
Sep 14, 2023 | The Rise and Potential of Large Language Model Based Agents: A Survey
Jan 11, 2024 | Secrets of RLHF in Large Language Models Part II: Reward Modeling
Jun 17, 2024 | Aligning Large Language Models from Self-Reference AI Feedback with one General Principle
Jul 7, 2025 | Pre-Trained Policy Discriminators are General Reward Models
Jan 19, 2026 | FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions
Dec 4, 2025 | Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
Jun 30, 2025 | Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective