Date          Name
Oct 28, 2022  On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis
Jun 27, 2024  Factor-Conditioned Speaking-Style Captioning
Aug 31, 2023  Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
Feb 11, 2024  Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Jul 12, 2025  Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?
Sep 22, 2023  NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Jul 11, 2022  Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data
Mar 29, 2019  Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise -
Jun 4, 2023   End-to-End Joint Target and Non-Target Speakers ASR
Aug 30, 2024  Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings
Jul 1, 2024   SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
May 30, 2025  Pretraining Multi-Speaker Identification for Neural Speaker Diarization
Jun 13, 2025  Dissecting the Segmentation Model of End-to-End Diarization with Vector Clustering
Jun 14, 2025  Mitigating Non-Target Speaker Bias in Guided Speaker Embedding
Jul 28, 2018  Ultrafast Dynamics of Electron-phonon Coupling in Transition-metal Dichalcogenides
Feb 14, 2025  Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge
Sep 9, 2024   NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge
Oct 9, 2024   Mamba-based Segmentation Model for Speaker Diarization
Oct 16, 2024  Guided Speaker Embedding