Showing 21–37 of 37 results
Date | Name
Dec 8, 2021 | Ethical and social risks of harm from Language Models
Mar 31, 2024 | A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)
Mar 22, 2024 | Explanation Hacking: The perils of algorithmic recourse
Oct 17, 2024 | Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models
Apr 15, 2024 | Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Sep 29, 2025 | Generative Value Conflicts Reveal LLM Priorities
Sep 5, 2024 | Beyond Model Interpretability: Socio-Structural Explanations in Machine Learning
Aug 21, 2024 | Epistemic Injustice in Generative AI
Aug 15, 2024 | The Future of Open Human Feedback
Oct 7, 2025 | EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
Feb 19, 2025 | Multi-Agent Risks from Advanced AI
Feb 24, 2026 | International AI Safety Report 2026
Dec 3, 2025 | Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value
Nov 24, 2024 | A Taxonomy of Systemic Risks from General-Purpose AI
Dec 20, 2024 | The Only Way is Ethics: A Guide to Ethical Research with Large Language Models
Sep 10, 2025 | The More You Automate, the Less You See: Hidden Pitfalls of AI Scientist Systems
Aug 10, 2025 | Position: Beyond Sensitive Attributes, ML Fairness Should Quantify Structural Injustice via Social Determinants