arXiv2
arXiv2 Search: au:"Nikhil Bhendawade"
Showing 1–9 of 9 results
Oct 15, 2025
Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
Feb 16, 2024
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Feb 4, 2025
M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference
May 11, 2021
EL-Attention: Memory Efficient Lossless Attention for Generation
Feb 25, 2026
The Design Space of Tri-Modal Masked Diffusion Models
Jun 8, 2021
FastSeq: Make Sequence Generation Faster
Jul 29, 2024
Apple Intelligence Foundation Language Models
Jul 17, 2025
Apple Intelligence Foundation Language Models: Tech Report 2025
Sep 24, 2025
FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models