Showing 1–17 of 17 results
Date          Name
Mar 24, 2026  Off-Policy Value-Based Reinforcement Learning for Large Language Models
Mar 2, 2026   Adam Converges Without Any Modification On Update Rules
Oct 31, 2025  ORGEval: Graph-Theoretic Evaluation of LLMs in Optimization Modeling
Sep 30, 2025  Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
May 27, 2025  Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives
Feb 16, 2025  AdaGC: Improving Training Stability for Large Language Model Pretraining
Nov 25, 2024  Exploring the Generalization Capabilities of AID-based Bi-level Optimization
Aug 29, 2024  Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Jun 24, 2024  Adam-mini: Use Fewer Learning Rates To Gain More
Feb 26, 2024  Why Transformers Need Adam: A Hessian Perspective
Oct 23, 2023  Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Aug 20, 2022  Adam Can Converge Without Any Modification On Update Rules
May 28, 2022  Efficient-Adam: Communication-Efficient Distributed Adam
Jan 14, 2021  Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Apr 29, 2020  Quantized Adam with Error Feedback
Aug 10, 2018  A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration
May 10, 2018  Arbitrary Style Transfer with Deep Feature Reshuffle