Showing 1–10 of 10 results
/ Date/ Name
Jan 29, 2026Failing to Explore: Language Models on Interactive TasksJan 28, 2026Active Learning for Decision Trees with Provable GuaranteesOct 27, 2025Your LLM Agents are Temporally Blind: The Misalignment Between Tool Use Decisions and Human Time PerceptionSep 29, 2025GHOST: Hallucination-Inducing Image Generation for Multimodal LLMsApr 9, 2025Kaleidoscope: In-language Exams for Massively Multilingual Vision EvaluationJan 28, 2025RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution SamplesNov 29, 2024INCLUDE: Evaluating Multilingual Language Understanding with Regional KnowledgeOct 15, 2023Seeking Next Layer Neurons' Attention for Error-Backpropagation-Like Training in a Multi-Agent Network FrameworkOct 6, 2023SPADE: Sparsity-Guided Debugging for Deep Neural NetworksSep 30, 2022Your Out-of-Distribution Detection Method is Not Robust!