arXiv2 Search: au:"Hongyu Lin"
Showing 1–5 of 5 results
Oct 24, 2025
When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models
Jul 20, 2025
RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback
Oct 28, 2024
Transferable Post-training via Inverse Value Learning
Jun 3, 2024
Towards Scalable Automated Alignment of LLMs: A Survey
Feb 27, 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following