arXiv2
au:"Chang Zhou" - arXiv2 Search
Showing 1–3 of 3 results
Aug 20, 2024
Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Jun 19, 2024
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
May 28, 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment