Date / Name
Apr 13, 2026 / Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music
Oct 2, 2024 / Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Jun 6, 2024 / ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions
Feb 3, 2024 / A Closer Look at the Limitations of Instruction Tuning
Dec 20, 2023 / FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning
Dec 20, 2023 / Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition
Oct 24, 2023 / DALE: Generative Data Augmentation for Low-Resource Legal NLP
Oct 12, 2023 / CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
Sep 18, 2023 / RECAP: Retrieval-Augmented Audio Captioning
May 18, 2023 / BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER
Mar 10, 2023 / UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Mar 2, 2023 / CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network
Nov 27, 2022 / A novel multimodal dynamic fusion network for disfluency detection in spoken utterances
Nov 2, 2022 / data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup
Oct 5, 2022 / CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations
Mar 31, 2022 / PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
Mar 31, 2022 / M-MELD: A Multilingual Multi-Party Dataset for Emotion Recognition in Conversations
Mar 30, 2022 / Span Classification with Structured Information for Disfluency Detection in Spoken Utterances
Mar 25, 2022 / DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Oct 14, 2021 / DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances