Date          Name
Oct 9, 2024   SAGE: Scalable Ground Truth Evaluations for Large Sparse Autoencoders
Dec 2, 2025   Too Late to Recall: Explaining the Two-Hop Problem in Multimodal Knowledge Retrieval
Jun 13, 2025  How Visual Representations Map to Language Feature Space in Multimodal LLMs
Jun 22, 2025  Understanding Reasoning in Thinking Language Models via Steering Vectors
Jul 16, 2025  Reasoning-Finetuning Repurposes Latent Representations in Base Models
Feb 6, 2026   Towards Understanding Multimodal Fine-Tuning: Spatial Features
Aug 28, 2025  Dyslexify: A Mechanistic Defense Against Typographic Attacks in CLIP
Mar 5, 2025   Mixture of Experts Made Intrinsically Interpretable
Jan 27, 2026  Sparse CLIP: Co-Optimizing Interpretability and Performance in Contrastive Learning
Oct 8, 2025   Base Models Know How to Reason, Thinking Models Learn When