Date / Name
May 8, 2022 / Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
May 6, 2022 / Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition
Apr 26, 2022 / Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure
Apr 25, 2022 / Parallel Synthesis for Autoregressive Speech Generation
Apr 25, 2022 / Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data
Apr 22, 2022 / Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Apr 8, 2022 / Transducer-based language embedding for spoken language identification
Apr 1, 2022 / Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Mar 31, 2022 / PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
Mar 30, 2022 / Span Classification with Structured Information for Disfluency Detection in Spoken Utterances
Mar 29, 2022 / Integrating Lattice-Free MMI into End-to-End Speech Recognition
Mar 28, 2022 / On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition
Mar 25, 2022 / DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Mar 25, 2022 / Automatic Song Translation for Tonal Languages
Mar 13, 2022 / CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Feb 15, 2022 / General-purpose, long-context autoregressive modeling with Perceiver AR
Feb 8, 2022 / Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Jan 6, 2022 / Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model
Nov 29, 2021 / Mixed Precision DNN Qunatization for Overlapped Speech Separation and Recognition
Nov 28, 2021 / Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information