Date          Name
Sep 4, 2025   Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding
May 24, 2023  Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation
Sep 19, 2023  Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement
Apr 12, 2023  Speech Reconstruction from Silent Tongue and Lip Articulation By Pseudo Target Generation and Domain Adversarial Training
Oct 16, 2024  ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
Sep 23, 2025  Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation
May 29, 2025  Vision-Integrated High-Quality Neural Speech Coding
Apr 9, 2025   A Streamable Neural Audio Codec with Residual Scalar-Vector Quantization for Real-Time Communication
Jan 19, 2026  CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction
Nov 1, 2024   MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
Sep 28, 2025  Understanding Textual Capability Degradation in Speech LLMs via Parameter Importance Analysis
Oct 11, 2025  Universal Discrete-Domain Speech Enhancement
Oct 30, 2024  APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
Oct 7, 2024   Stage-Wise and Prior-Aware Neural Speech Phase Prediction
Aug 11, 2025  Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?