Showing 1–20 of 31 results
Date / Name
Apr 8, 2022 / Transducer-based language embedding for spoken language identification
Mar 1, 2021 / CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation
Dec 23, 2019 / A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
Sep 12, 2017 / End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks
Mar 7, 2017 / Raw Waveform-based Speech Enhancement by Fully Convolutional Networks
Feb 21, 2025 / Retrieval-Augmented Speech Recognition Approach for Domain Challenges
Sep 5, 2025 / Layer-wise Analysis for Quality of Multilingual Synthesized Speech
Jul 29, 2022 / Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Dec 27, 2019 / Cross-scale Attention Model for Acoustic Event Classification
Jan 9, 2021 / Coupling a generative model with a discriminative learning framework for speaker verification
Jul 9, 2020 / Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network
Jan 16, 2018 / Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification
May 19, 2025 / Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR
Sep 6, 2025 / New Insights into Optimal Alignment of Acoustic and Linguistic Representations for Knowledge Transfer in ASR
Sep 10, 2019 / Multimodal Attention Branch Network for Perspective-Free Sentence Generation
Jun 17, 2019 / Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target-Source Classification
Jun 11, 2018 / A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions
Apr 22, 2022 / Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Dec 24, 2020 / Unsupervised neural adaptation model based on optimal transport for spoken language identification
Sep 28, 2023 / Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR