"au:"Qianchao Zhu"" — arXiv2 Search

Showing 1–6 of 6 results

Jun 17, 2024  SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Jan 31, 2026  PROBE: Co-Balancing Computation and Communication in MoE Inference via Real-Time Predictive Prefetching
Apr 28, 2022  Mat2Stencil: A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid
Sep 30, 2025  SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training
May 19, 2025  HeteroSpec: Leveraging Contextual Heterogeneity for Efficient Speculative Decoding
Sep 26, 2025  Zeppelin: Balancing Variable-length Workloads in Data Parallel Large Model Training