Date / Name
Oct 11, 2019 / Query-by-example on-device keyword spotting
Oct 10, 2019 / Orthogonality Constrained Multi-Head Attention For Keyword Spotting
Apr 13, 2024 / On Speculative Decoding for Multimodal Large Language Models
Mar 27, 2025 / An NLP-Driven Approach Using Twitter Data for Tailored K-pop Artist Recommendations
Feb 5, 2026 / Double-P: Hierarchical Top-P Sparse Attention for Long-Context LLMs
Jul 11, 2024 / What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction
Aug 16, 2023 / Painter: Teaching Auto-regressive Language Models to Draw Sketches
Jun 6, 2023 / Deductive Verification of Chain-of-Thought Reasoning
Apr 2, 2024 / HyperCLOVA X Technical Report
Nov 1, 2023 / Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving
Oct 24, 2024 / AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
Jan 30, 2026 / Fast Forward: Accelerating LLM Prefill with Predictive FFN Sparsity
Feb 21, 2024 / Recursive Speculative Decoding: Accelerating LLM Inference via Sampling Without Replacement
Jun 13, 2024 / ToSA: Token Selective Attention for Efficient Vision Transformers
Jun 30, 2023 / Look, Remember and Reason: Grounded reasoning in videos with language models
Jun 28, 2025 / VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs
Feb 9, 2026 / QUOKA: Query-Oriented KV Selection For Efficient LLM Prefill
Mar 18, 2026 / Efficient Training-Free Multi-Token Prediction via Embedding-Space Probing
Mar 9, 2026 / ConFu: Contemplate the Future for Better Speculative Sampling
Feb 29, 2024 / Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs