"au:"Zhangwei Gao"" — arXiv2 Search

Showing 1–13 of 13 results

Date / Name

Oct 21, 2024  Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Nov 15, 2024  Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Aug 25, 2025  InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Dec 6, 2024  Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Oct 13, 2025  InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models
Jun 12, 2024  OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Aug 21, 2025  Intern-S1: A Scientific Multimodal Foundation Model
Jul 22, 2024  MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Oct 26, 2023  ControlLLM: Augment Language Models with Tools by Searching on Graphs
Mar 13, 2025  VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Apr 25, 2024  How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Apr 14, 2025  InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Oct 14, 2025  MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites