Date          Name
Feb 19, 2024  LoRA+: Efficient Low Rank Adaptation of Large Models
Jun 25, 2025  PLoP: Precise LoRA Placement for Efficient Finetuning of Large Models
Feb 5, 2026   Learning Rate Scaling across LoRA Ranks and Transfer to Full Finetuning
Nov 3, 2025   A Proof of Learning Rate Transfer under $μ$P
Feb 22, 2022  From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
Apr 7, 2024   How Bad is Training on Synthetic Data? A Statistical Analysis of Language Model Collapse
Oct 11, 2024  Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
Oct 5, 2024   Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement
Apr 10, 2026  The Myth of Expert Specialization in MoEs: Why Routing Reflects Geometry, Not Necessarily Domain Expertise
Jun 10, 2025  On the Stability of the Jacobian Matrix in Deep Neural Networks