Showing 1–15 of 15 results
Date          Name
Dec 13, 2024  AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era
Jan 30, 2026  Spectra: Rethinking Optimizers for LLMs Under Spectral Anisotropy
Aug 30, 2025  Metis: Training LLMs with FP4 Quantization
Dec 17, 2024  LLMCL-GEC: Advancing Grammatical Error Correction with LLM-Driven Curriculum Learning
Sep 22, 2025  LIMI: Less is More for Agency
Jul 18, 2025  DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training
Jan 7, 2019   Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning
Oct 1, 2025   An Efficient, Reliable and Observable Collective Communication Library in Large-scale GPU Training Clusters
Feb 13, 2026  Multi-Head Attention as a Source of Catastrophic Forgetting in MoE Transformers
Feb 13, 2026  SD-MoE: Spectral Decomposition for Effective Expert Specialization
Mar 13, 2026  daVinci-Env: Open SWE Environment Synthesis at Scale
Mar 28, 2026  daVinci-LLM: Towards the Science of Pretraining
Nov 19, 2025  SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
Mar 11, 2026  The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training
Oct 9, 2024   OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles