"au:"Nima Mesgarani"" — arXiv2 SearchShowing 1–9 of 9 results
Jul 20, 2025  DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis
Sep 16, 2024  StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion
Aug 13, 2024  Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation
Jul 18, 2023  SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
Jun 13, 2023  StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Jan 20, 2023  Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
Dec 29, 2022  StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models
May 30, 2022  StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Jul 21, 2021  StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion