Date / Name
Oct 21, 2024 / Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Nov 15, 2024 / Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Aug 25, 2025 / InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Dec 6, 2024 / Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Oct 13, 2025 / InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models
Jun 12, 2024 / OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Aug 21, 2025 / Intern-S1: A Scientific Multimodal Foundation Model
Jul 22, 2024 / MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Oct 26, 2023 / ControlLLM: Augment Language Models with Tools by Searching on Graphs
Mar 13, 2025 / VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Apr 25, 2024 / How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Apr 14, 2025 / InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Oct 14, 2025 / MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites