Publications

Oct 24, 2025 — Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
Oct 23, 2025 — Ask a Strong LLM Judge when Your Reward Model is Uncertain
May 26, 2025 — Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
May 22, 2025 — Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
Sep 16, 2024 — Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering
Jun 21, 2024 — Robust Reinforcement Learning from Corrupted Human Feedback
Mar 8, 2024 — GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Nov 3, 2023 — Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
Oct 19, 2023 — Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
Oct 16, 2023 — Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms
Jun 20, 2023 — LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Mar 18, 2023 — AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
Oct 4, 2022 — Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Jun 25, 2022 — PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Apr 15, 2022 — MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
Mar 1, 2021 — A Biased Graph Neural Network Sampler with Near-Optimal Regret
Sep 29, 2018 — AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods