"au:"Qianchao Zhu"" — arXiv2 Search
Showing 1–6 of 6 results
Jun 17, 2024
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Jan 31, 2026
PROBE: Co-Balancing Computation and Communication in MoE Inference via Real-Time Predictive Prefetching
Apr 28, 2022
Mat2Stencil: A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid
Sep 30, 2025
SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training
May 19, 2025
HeteroSpec: Leveraging Contextual Heterogeneity for Efficient Speculative Decoding
Sep 26, 2025
Zeppelin: Balancing Variable-length Workloads in Data Parallel Large Model Training