Date          Name
Oct 30, 2025  AMO-Bench: Large Language Models Still Struggle in High School Math Competitions
May 23, 2023  Skill-Based Few-Shot Selection for In-Context Learning
Feb 23, 2023  Does Deep Learning Learn to Abstract? A Systematic Probing Framework
Jul 14, 2021  Learning Algebraic Recombination for Compositional Generalization
May 8, 2023   How Do In-Context Examples Affect Compositional Generalization?
Apr 13, 2026  General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
Mar 7, 2022   Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models
Apr 25, 2024  Make Your LLM Fully Utilize the Context
Oct 31, 2023  Learning From Mistakes Makes LLM Better Reasoner
Mar 22, 2026  LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
Jun 18, 2020  Compositional Generalization by Learning Analytical Expressions
Feb 29, 2024  Compositional API Recommendation for Library-Oriented Code Generation
Nov 7, 2024   STAND-Guard: A Small Task-Adaptive Content Moderation Model
Dec 19, 2024  Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation
Dec 26, 2024  Repository Structure-Aware Training Makes SLMs Better Issue Resolver
Jan 23, 2026  LongCat-Flash-Thinking-2601 Technical Report