"au:"Chang Zhou"" — arXiv2 Search

Showing 1–3 of 3 results

Aug 20, 2024  Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Jun 19, 2024  Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
May 28, 2024  Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment