"au:"Eli Tran-Johnson"" — arXiv2 Search

Showing 1–7 of 7 results

Date / Name

Nov 4, 2022 / Measuring Progress on Scalable Oversight for Large Language Models
Feb 15, 2023 / The Capacity for Moral Self-Correction in Large Language Models
Aug 23, 2022 / Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Dec 15, 2022 / Constitutional AI: Harmlessness from AI Feedback
Oct 20, 2023 / Specific versus General Principles for Constitutional AI
Jul 11, 2022 / Language Models (Mostly) Know What They Know
Dec 19, 2022 / Discovering Language Model Behaviors with Model-Written Evaluations