Efficient Speculative Decoding for Llama at Scale: Challenges and Solutions — arXiv2