Date | Name
May 27, 2024 | Does Diffusion Beat GAN in Image Super Resolution?
Apr 9, 2026 | KV Cache Offloading for Context-Intensive Tasks
Sep 27, 2025 | Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
Oct 14, 2022 | CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models
Mar 25, 2023 | Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression
Aug 30, 2024 | The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
Dec 21, 2024 | Label Privacy in Split Learning for Large Models with Parameter-Efficient Training
Aug 31, 2024 | Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
Apr 8, 2024 | YaART: Yet Another ART Rendering Technology
Dec 11, 2025 | Asynchronous Reasoning: Training-Free Interactive Thinking LLMs
Aug 3, 2023 | Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Feb 22, 2023 | A critical look at the evaluation of GNNs under heterophily: Are we really making progress?
Jun 22, 2022 | A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta
May 23, 2024 | PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
Dec 2, 2024 | Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Oct 18, 2024 | EvoPress: Accurate Dynamic Model Compression via Evolutionary Search
Jan 11, 2024 | Extreme Compression of Large Language Models via Additive Quantization
Mar 20, 2025 | Scale-wise Distillation of Diffusion Models
Sep 13, 2022 | Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond
Jun 5, 2023 | SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression