Date / Name
Dec 16, 2021 / NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge
Oct 20, 2023 / Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting
May 23, 2023 / CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models
Feb 16, 2024 / Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement
Jun 30, 2025 / Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
Apr 14, 2025 / TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models
Dec 9, 2025 / EcomBench: Towards Holistic Evaluation of Foundation Agents in E-commerce
Feb 23, 2025 / MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
May 29, 2025 / MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration
Jul 27, 2025 / Diversity-Enhanced Reasoning for Subjective Questions
Feb 26, 2026 / AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
Jul 1, 2020 / COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation
Mar 9, 2022 / A Weibo Dataset for the 2022 Russo-Ukrainian Crisis
May 29, 2023 / Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
Nov 16, 2023 / R-Tuning: Instructing Large Language Models to Say `I Don't Know'
Apr 9, 2025 / Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization
Apr 23, 2025 / Unveiling the Lack of LVLM Robustness to Fundamental Visual Variations: Why and Path Forward
Jun 6, 2025 / MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?
Jul 10, 2025 / DocCHA: Towards LLM-Augmented Interactive Online diagnosis System
Feb 17, 2025 / VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues