Showing 1–20 of 26 results
Date | Name
Apr 22, 2026 | Too Sharp, Too Sure: When Calibration Follows Curvature
Apr 15, 2026 | Momentum Further Constrains Sharpness at the Edge of Stochastic Stability
Sep 30, 2025 | Hierarchical Reasoning Models: Perspectives and Misconceptions
May 4, 2025 | Heterosynaptic Circuits Are Universal Gradient Machines
Feb 7, 2025 | Parameter Symmetry Potentially Unifies Deep Learning Theory
Dec 28, 2024 | Self-Assembly of a Biologically Plausible Learning Circuit
Oct 3, 2024 | Formation of Representations in Neural Networks
Dec 31, 2020 | Explicit regularization and implicit bias in deep network classifiers trained with the square loss
Jun 24, 2020 | Hierarchically Compositional Tasks and Deep Convolutional Networks
Aug 25, 2019 | Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
Mar 12, 2019 | Theory III: Dynamics and Generalization in Deep Networks
Nov 8, 2018 | Biologically-plausible learning algorithms can scale to large datasets
Jul 25, 2018 | A Surprising Linear Relationship Predicts Test Performance in Deep Networks
Jun 29, 2018 | Theory IIIb: Generalization in Deep Networks
Jan 7, 2018 | Theory of Deep Learning IIb: Optimization Properties of SGD
Dec 30, 2017 | Theory of Deep Learning III: explaining the non-overfitting puzzle
Mar 28, 2017 | Theory II: Landscape of the Empirical Risk in Deep Learning
Jan 18, 2017 | Compression of Deep Neural Networks for Image Instance Retrieval
Nov 2, 2016 | Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
Oct 19, 2016 | Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning