Date          Name
Nov 15, 2018  Comprehensive evaluation of statistical speech waveform synthesis
Jun 28, 2022  Expressive, Variable, and Controllable Duration Modelling in TTS
Jun 20, 2023  eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer
Dec 17, 2020  Parallel WaveNet conditioned on VAE latent vectors
Dec 12, 2019  Singing Synthesis: with a little help from my attention
Nov 2, 2020   CAMP: a Two-Stage Approach to Modelling Prosody in Context
Jun 29, 2021  Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Jun 27, 2022  CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
Feb 12, 2024  BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Feb 13, 2022  Distribution augmentation for low-resource expressive text-to-speech
Dec 11, 2019  Voice Conversion for Whispered Speech Synthesis
Nov 4, 2020   Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech
Nov 15, 2018  Towards achieving robust universal neural vocoding
Jun 14, 2021  A learned conditional prior for the VAE acoustic space of a TTS system
Jun 29, 2022  Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody
Mar 4, 2019   Traditional Machine Learning for Pitch Detection
Dec 30, 2019  Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis
Apr 30, 2020  CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech
Jan 19, 2018  Proceedings of eNTERFACE 2015 Workshop on Intelligent Interfaces
Jul 13, 2023  Controllable Emphasis with zero data for text-to-speech