arXiv2
"au:"David D. Baek"" — arXiv2 Search
Showing 1–9 of 9 results
Oct 10, 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Feb 8, 2024
GenEFT: Understanding Statics and Dynamics of Model Generalization via Effective Theory
Apr 25, 2025
Scaling Laws For Scalable Oversight
Oct 10, 2024
Investigating Representation Universality: Case Study on Genealogical Representations
Mar 5, 2025
Towards Understanding Distilled Reasoning Models: A Representational Approach
Feb 3, 2025
Harmonic Loss Trains Interpretable AI Models
Dec 8, 2022
Gate Error Analysis of Tunable Coupling Architecture in the Large-scale Superconducting Quantum System
Feb 26, 2026
A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
Oct 20, 2025
Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth