Showing 1–18 of 18 results
Date          Name
Jun 8, 2021   Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Mar 2, 2021   Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect
Mar 25, 2022  BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Aug 26, 2021  Bilateral Denoising Diffusion Models
May 25, 2023  Efficient Neural Music Generation
Apr 21, 2022  FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Mar 1, 2021   Contrastive Separative Coding for Self-supervised Representation Learning
Jan 13, 2021  Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks
Mar 1, 2021   Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation
Mar 25, 2025  Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation
Jul 17, 2025  Apple Intelligence Foundation Language Models: Tech Report 2025
Jun 18, 2017  Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Mar 10, 2020  Benchmarking TinyML Systems: Challenges and Direction
Feb 25, 2026  SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model
Oct 28, 2019  Mixup-breakdown: a consistency training method for improving generalization of speech separation models
May 26, 2023  Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model
Aug 26, 2024  Foundation Models for Music: A Survey
Sep 9, 2024   SongCreator: Lyrics-based Universal Song Generation