Date          Name
Jan 24, 2025  RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
Jan 10, 2025  Self-Evolving Critique Abilities in Large Language Models
Mar 5, 2024   MathScale: Scaling Instruction Tuning for Mathematical Reasoning
May 28, 2024  ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling
Oct 5, 2025   CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling
Apr 1, 2026   Do Phone-Use Agents Respect Your Privacy?
Mar 23, 2023  Modular Retrieval for Generalization and Interpretation
Dec 16, 2024  Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
Oct 31, 2023  Perturbing Masses: A Study of Centered Co-Circular Configurations in Power-Law n-Body Problems
May 12, 2025  Learning from Peers in Reasoning Models
Feb 2, 2026   Kimi K2.5: Visual Agentic Intelligence
Apr 30, 2026  Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Oct 23, 2025  Teaching Language Models to Reason with Tools
Jun 11, 2025  CoRT: Code-integrated Reasoning within Thinking
Apr 17, 2024  On the Nonexistence of Centered Co-Circular Central Configurations With Three Unequal masses
Oct 10, 2024  Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
Apr 17, 2026  Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning
Aug 24, 2022  DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
Feb 20, 2024  Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models