Showing 1–17 of 17 results
Date          Name
Mar 24, 2026  Off-Policy Value-Based Reinforcement Learning for Large Language Models
Mar 2, 2026   Adam Converges Without Any Modification On Update Rules
Oct 31, 2025  ORGEval: Graph-Theoretic Evaluation of LLMs in Optimization Modeling
Sep 30, 2025  Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
May 27, 2025  Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives
Feb 16, 2025  AdaGC: Improving Training Stability for Large Language Model Pretraining
Nov 25, 2024  Exploring the Generalization Capabilities of AID-based Bi-level Optimization
Aug 29, 2024  Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Jun 24, 2024  Adam-mini: Use Fewer Learning Rates To Gain More
Feb 26, 2024  Why Transformers Need Adam: A Hessian Perspective
Oct 23, 2023  Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Aug 20, 2022  Adam Can Converge Without Any Modification On Update Rules
May 28, 2022  Efficient-Adam: Communication-Efficient Distributed Adam
Jan 14, 2021  Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Apr 29, 2020  Quantized Adam with Error Feedback
Aug 10, 2018  A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration
May 10, 2018  Arbitrary Style Transfer with Deep Feature Reshuffle