Showing 1–20 of 30 results
Date | Name
Oct 22, 2021 | Probabilistic fine-tuning of pruning masks and PAC-Bayes self-bounded learning
Feb 19, 2020 | Robust Pruning at Initialization
Feb 19, 2019 | On the Impact of the Activation Function on Deep Neural Networks Training
Oct 24, 2020 | Stable ResNet
Oct 3, 2022 | On the infinite-depth limit of finite-width neural networks
Oct 2, 2023 | Commutative Width and Depth Scaling in Deep Neural Networks
Oct 22, 2021 | Feature Learning and Signal Propagation in Deep Neural Networks
Aug 9, 2017 | Cleaning the correlation matrix with a denoising autoencoder
Jun 17, 2025 | Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size
Feb 11, 2026 | $μ$pscaling small models: Principled warm starts and hyperparameter transfer
May 21, 2018 | On the Selection of Initialization and Activation Function for Deep Neural Networks
Sep 17, 2023 | On the Connection Between Riemann Hypothesis and a Special Class of Neural Networks
Oct 3, 2023 | Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks
Sep 29, 2023 | Leave-one-out Distinguishability in Machine Learning
Feb 1, 2023 | Width and Depth Limits Commute in Residual Networks
May 31, 2019 | Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit
Aug 11, 2017 | On the overestimation of the largest eigenvalue of a covariance matrix
Feb 14, 2023 | Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Jun 6, 2021 | Regularization in ResNet with Stochastic Depth
Jun 12, 2024 | The Impact of Initialization on LoRA Finetuning Dynamics