"au:"Stephen Casper"" — arXiv2 Search

Showing 1–6 of 6 results

Jan 17, 2026  Expanding External Access To Frontier AI Models For Dangerous Capability Evaluations
Aug 8, 2025  Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
Apr 15, 2024  Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Nov 6, 2023  Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
Jul 27, 2023  Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Mar 4, 2021  Clusterability in Neural Networks