Date          Name
Feb 15, 2024  MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music
Feb 6, 2024   Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue
Dec 26, 2023  Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models
Dec 15, 2023  MORE: A Multimodal Object-Entity Relation Extraction Dataset with a Benchmark Evaluation
Oct 28, 2023  Audio-Visual Instance Segmentation
Oct 16, 2023  Evading Detection Actively: Toward Anti-Forensics against Forgery Localization
Sep 19, 2023  MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
Aug 24, 2023  Spherical Vision Transformer for 360-degree Video Saliency Prediction
Jul 27, 2023  Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration
Jun 27, 2023  You Can Mask More For Extremely Low-Bitrate Image Compression
Jun 15, 2023  The 2023 Video Similarity Dataset and Challenge
May 12, 2023  MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing
May 4, 2023   Noise-Resistant Multimodal Transformer for Emotion Recognition
Apr 28, 2023  LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Apr 23, 2023  Experts prefer text but videos help novices: an analysis of the utility of multi-media content
Apr 10, 2023  ITportrait: Image-Text Coupled 3D Portrait Domain Adaptation
Mar 14, 2023  CAT: Causal Audio Transformer for Audio Classification
Jan 10, 2023  From Plate to Prevention: A Dietary Nutrient-aided Platform for Health Promotion in Singapore
Nov 18, 2022  Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Nov 1, 2022   SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation