"au:"Jianwu Dang"" — arXiv2 Search

Showing 1–20 of 51 results

Oct 23, 2019 - Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection
Nov 2, 2022 - Monolingual Recognizers Fusion for Code-switching Speech Recognition
Jan 5, 2025 - Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
Aug 11, 2024 - VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Sep 28, 2025 - LORT: Locally Refined Convolution and Taylor Transformer for Monaural Speech Enhancement
Nov 19, 2020 - Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals
Apr 17, 2021 - Exploring Deep Learning for Joint Audio-Visual Lip Biometrics
Feb 21, 2022 - L-SpEx: Localized Target Speaker Extraction
Apr 30, 2022 - Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning
Jul 15, 2022 - MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources
Oct 9, 2022 - VCSE: Time-Domain Visual-Contextual Speaker Extraction Network
Aug 31, 2024 - Progressive Residual Extraction based Pre-training for Speech Representation Learning
Sep 29, 2025 - Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
Aug 4, 2025 - SecoustiCodec: Cross-Modal Aligned Streaming Single-Codecbook Speech Codec
Jan 24, 2025 - Efficient Emotion and Speaker Adaptation in LLM-Based TTS via Characteristic-Specific Partial Fine-Tuning
Sep 27, 2024 - Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS
Feb 16, 2026 - Breaking Data Efficiency Dilemma: A Federated and Augmented Learning Framework For Alzheimer's Disease Detection via Speech
Dec 18, 2023 - A Refining Underlying Information Framework for Monaural Speech Enhancement
Dec 7, 2022 - MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation
May 18, 2023 - Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation