Showing 1–20 of 83 results
Date            Name
May 2, 2023     Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees
Feb 1, 2019     Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication
Nov 3, 2020     A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!
Sep 4, 2020     On Communication Compression for Distributed Optimization on Heterogeneous Data
Feb 18, 2020    Is Local SGD Better than Minibatch SGD?
May 30, 2024    Towards Faster Decentralized Stochastic Optimization with Communication Compression
Jul 12, 2023    Locally Adaptive Federated Learning
Jun 6, 2025     Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization
Oct 18, 2012    Variable Metric Random Pursuit
Jun 10, 2020    Extrapolation for Large-batch Training in Deep Learning
Jul 9, 2019     Unified Optimal Analysis of the (Stochastic) Gradient Method
Sep 11, 2019    The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication
Nov 10, 2021    Linear Speedup in Personalized Collaborative Learning
Oct 14, 2019    SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
Oct 8, 2021     RelaySum for Decentralized Deep Learning on Heterogeneous Data
Jan 25, 2025    Scalable Decentralized Learning with Teleportation
Mar 5, 2024     Non-convex Stochastic Composite Optimization with Polyak Momentum
Jun 1, 2018     Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients
Jul 31, 2020    On the Convergence of SGD with Biased Gradients
Feb 18, 2022    Tackling benign nonconvexity with smoothing and stochastic gradients