Showing 1–20 of 21 results
Date / Name
May 17, 2025 / Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
Oct 12, 2025 / AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration
Jun 10, 2025 / VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks
Feb 18, 2025 / VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Jan 27, 2026 / DiaDem: Advancing Dialogue Descriptions in Audiovisual Video Captioning for Multimodal Large Language Models
Jan 30, 2026 / ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search
Feb 2, 2026 / Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks
Dec 9, 2025 / The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
Feb 9, 2026 / TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
Aug 30, 2022 / Spacecraft depth completion based on the gray image and the sparse depth map
Apr 14, 2025 / Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Feb 10, 2026 / Beyond Closed-Pool Video Retrieval: A Benchmark and Agent Framework for Real-World Video Search and Moment Localization
Dec 13, 2023 / Curriculum-Enhanced Residual Soft An-Isotropic Normalization for Over-smoothness in Deep GNNs
May 27, 2025 / MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
Sep 15, 2025 / D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs
Feb 4, 2026 / OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
Sep 29, 2025 / RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
Jan 17, 2025 / Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models
Dec 17, 2025 / GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models
Dec 10, 2025 / VABench: A Comprehensive Benchmark for Audio-Video Generation