"au:"Mike Seltzer"" — arXiv2 Search

Showing 1–15 of 15 results

/ Date/ Name

Apr 2, 2024  Effective internal language model training and fusion for factorized transducer model
Jul 11, 2025  SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment
Oct 8, 2025  Can Speech LLMs Think while Listening?
Oct 27, 2024  Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation
Oct 7, 2021  Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution
Dec 21, 2024  Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition
Feb 6, 2026  Scaling Speech Tokenizers with Diffusion Autoencoders
Jul 21, 2023  Prompting Large Language Models with Speech Recognition Abilities
Nov 12, 2023  AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs
Oct 25, 2022  Dynamic Speech Endpoint Detection with Regression Targets
May 21, 2023  Multi-Head State Space Model for Speech Recognition
Sep 5, 2023  TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Oct 7, 2021  Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
Oct 21, 2020  Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Jul 9, 2021  On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models