"au:"Hung-yi Lee"" — arXiv2 Search

Showing 1–20 of 24 results

Apr 12, 2026  CodaRAG: Connecting the Dots with Associativity Inspired by Complementary Learning
Apr 6, 2026  Joint Fullband-Subband Modeling for High-Resolution SingFake Detection
Nov 11, 2024  Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
Sep 13, 2024  DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Jun 7, 2024  Neural Codec-based Adversarial Sample Detection for Speaker Verification
Feb 20, 2024  EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
Oct 4, 2023  Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model
Sep 29, 2023  Low-Resource Self-Supervised Learning with SSL-Enhanced TTS
Oct 27, 2022  Multimodal Transformer Distillation for Audio-Visual Synchronization
Oct 3, 2022  Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
May 8, 2022  Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Apr 25, 2022  Parallel Synthesis for Autoregressive Speech Generation
Apr 1, 2022  Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Mar 6, 2021  Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
May 15, 2020  WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU
Dec 5, 2019  Towards Robust Neural Vocoding for Speech Generation: A Survey
May 28, 2019  Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion
Oct 30, 2018  Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data
Jul 21, 2018  Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval
Mar 29, 2018  Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only