Date / Name
Jan 13, 2021 / Should Ensemble Members Be Calibrated?
Oct 27, 2022 / Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations
Mar 31, 2022 / Neural Architecture Search for Speech Emotion Recognition
Mar 2, 2022 / A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS
Sep 22, 2022 / A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS
Mar 14, 2023 / Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection
Mar 14, 2023 / A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition
Nov 3, 2020 / Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion
Feb 4, 2022 / The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge
Jun 18, 2022 / Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion
Aug 31, 2023 / QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning
May 24, 2023 / SAIL: Search-Augmented Instruction Learning
May 25, 2023 / Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator
Feb 2, 2023 / Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition
Jun 4, 2024 / SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
Jun 5, 2024 / Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder
Aug 29, 2023 / Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Jan 2, 2025 / learning discriminative features from spectrograms using center loss for speech emotion recognition
Dec 9, 2024 / Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection
Sep 13, 2024 / Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions