Date / Name
May 12, 2020 / DiscreTalk: Text-to-Speech as a Machine Translation Problem
Oct 15, 2021 / ESPnet2-TTS: Extending the Edge of TTS Research
Apr 14, 2021 / Non-autoregressive sequence-to-sequence voice conversion
Jul 28, 2018 / Back-Translation-Style Data Augmentation for End-to-End ASR
Apr 22, 2018 / Multi-Head Decoder for End-to-End Speech Recognition
Oct 24, 2019 / ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Dec 17, 2021 / Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem
Feb 17, 2022 / Acoustic Event Detection with Classifier Chains
Jan 22, 2023 / Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study
Nov 27, 2018 / Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Jul 21, 2019 / Statistical Voice Conversion with Quasi-Periodic WaveNet Vocoder
Mar 26, 2020 / Non-parallel Voice Conversion System with WaveNet Vocoder and Collapsed Speech Suppression
Aug 7, 2020 / Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Sep 13, 2019 / A Comparative Study on Transformer vs RNN in Speech Applications
Jun 11, 2021 / Anomalous Sound Detection Using a Binary Classification Model and Class Centroids
May 25, 2025 / Serial-OE: Anomalous sound detection based on serial method with outlier exposure capable of using small amounts of anomalous data for training
May 2, 2019 / Investigation of F0 conditioning and Fully Convolutional Networks in Variational Autoencoder based Voice Conversion
Oct 26, 2020 / Recent Developments on ESPnet Toolkit Boosted by Conformer
Jul 24, 2019 / Non-Parallel Voice Conversion with Cyclic Variational Autoencoder
May 18, 2020 / Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation