Nov 19, 2025 - MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping
Oct 31, 2025 - Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals
Aug 13, 2025 - LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
Jun 4, 2025 - Pre$^3$: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
May 16, 2025 - QVGen: Pushing the Limit of Quantized Video Generative Models
Jul 30, 2024 - OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance
Mar 11, 2024 - 2023 Low-Power Computer Vision Challenge (LPCVC) Summary
Nov 27, 2023 - TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Oct 20, 2023 - Exploring the Potential of Flexible 8-bit Format: Design and Algorithm
Aug 8, 2023 - Lossy and Lossless (L$^2$) Post-training Model Size Compression
Jul 1, 2023 - SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency
Apr 18, 2023 - Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Sep 28, 2022 - Exploring the Relationship between Architecture and Adversarially Robust Generalization
Nov 5, 2021 - MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Sep 2, 2021 - Real World Robustness from Systematic Noise
Jun 13, 2021 - A Free Lunch From ANN: Towards Efficient, Accurate Spiking Neural Networks Calibration
Feb 10, 2021 - BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
Oct 9, 2020 - Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
Dec 29, 2019 - Towards Unified INT8 Training for Convolutional Neural Network
Aug 14, 2019 - Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks