"au:"Xander Davies"" — arXiv2 Search
Showing 1–3 of 3 results
Oct 8, 2025
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Aug 8, 2025
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
Jul 27, 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback