Date          Name
May 27, 2019  Learning step sizes for unfolded sparse coding
Feb 10, 2020  Super-efficiency of automatic differentiation for functions defined as a minimum
Jun 25, 2017  Faster independent component analysis by preconditioning with Hessian approximations
Feb 3, 2025   Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Aug 21, 2020  Spectral independent component analysis with noise modeling for M/EEG source separation
Nov 27, 2020  Deep orthogonal linear networks are shallow
May 25, 2018  Stochastic algorithms with descent guarantees for ICA
Oct 3, 2024   Dynamic Gradient Alignment for Online Data Mixing
Jul 12, 2025  Scaling Laws for Optimal Data Mixtures
Oct 26, 2023  A Challenge in Reweighting Data with Bilevel Optimization
Feb 15, 2021  Fast and accurate optimization on the orthogonal manifold without retraction
Nov 29, 2017  Faster ICA under orthogonal constraint
Feb 2, 2024   Need a Small Specialized Language Model? Plan Early!
Oct 22, 2021  Sinkformers: Transformers with Doubly Stochastic Attention
Mar 29, 2023  Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints
Feb 5, 2024   Careful with that Scalpel: Improving Gradient Surgery with an EMA
May 25, 2020  mvlearn: Multiview Machine Learning in Python
Nov 28, 2018  Beyond Pham's algorithm for joint diagonalization
Sep 5, 2024   The AdEMAMix Optimizer: Better, Faster, Older
May 2, 2024   Optimization without Retraction on the Random Generalized Stiefel Manifold