"au:"Xander Davies"" — arXiv2 Search

Showing 1–3 of 3 results

Oct 8, 2025: Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples

Aug 8, 2025: Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs

Jul 27, 2023: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback