Date | Name
May 22, 2025 | PCMamba: Physics-Informed Cross-Modal State Space Model for Dual-Camera Compressive Hyperspectral Imaging
May 21, 2025 | FRN: Fractal-Based Recursive Spectral Reconstruction Network
Sep 17, 2025 | Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents
Nov 19, 2020 | Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals
Feb 21, 2022 | L-SpEx: Localized Target Speaker Extraction
Jul 15, 2022 | MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources
Oct 9, 2022 | VCSE: Time-Domain Visual-Contextual Speaker Extraction Network
Mar 9, 2024 | sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks
Jan 5, 2024 | Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio
Sep 15, 2023 | Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech
Aug 31, 2024 | Progressive Residual Extraction based Pre-training for Speech Representation Learning
Dec 22, 2024 | Time-Graph Frequency Representation with Singular Value Decomposition for Neural Speech Enhancement
Dec 26, 2023 | The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge
Sep 19, 2023 | USED: Universal Speaker Extraction and Diarization
Sep 29, 2025 | Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
Jan 24, 2025 | Efficient Emotion and Speaker Adaptation in LLM-Based TTS via Characteristic-Specific Partial Fine-Tuning
Sep 13, 2023 | PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network
Mar 31, 2022 | A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Sep 24, 2024 | WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Jan 5, 2025 | Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module