Showing 1–16 of 16 results
Date          Name
Apr 6, 2024   Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models
Oct 9, 2025   Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding
Aug 26, 2025  Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning
Jun 15, 2025  CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making
Jun 1, 2025   Fast or Slow? Integrating Fast Intuition and Deliberate Thinking for Enhancing Visual Question Answering
Apr 16, 2024  Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models
Apr 20, 2025  OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding
Jun 1, 2025   HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models
Oct 20, 2024  Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
Mar 18, 2026  Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos
Mar 11, 2026  CodePercept: Code-Grounded Visual STEM Perception for MLLMs
Dec 2, 2025   Beyond N-grams: A Hierarchical Reward Learning Framework for Clinically-Aware Medical Report Generation
Mar 4, 2026   From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning
Aug 14, 2025  Med-GLIP: Advancing Medical Language-Image Pre-training with Large-scale Grounded Dataset
Apr 21, 2026  How Far Are Video Models from True Multimodal Reasoning?
Jan 8, 2025   Unlocking Multimodal Mathematical Reasoning via Process Reward Model