au:"Boxi Cao" — arXiv Search
Showing 1–2 of 2 results
Oct 24, 2025 — When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models
Jun 3, 2024 — Towards Scalable Automated Alignment of LLMs: A Survey