"au:"Jierui Zuo"" — arXiv2 Search
Showing 1–2 of 2 results
Apr 13, 2026
DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO
Jan 31, 2025
On Pareto Optimality for Parametric Choice Bandits