"au:"Limin Wang"" - arXiv2 Search
Showing 1–8 of 8 results
Date / Name
Sep 29, 2025 / Learning Goal-Oriented Vision-and-Language Navigation with Self-Improving Demonstrations at Scale
Aug 25, 2025 / InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Mar 18, 2025 / Make Your Training Flexible: Towards Deployment-Efficient Video Models
Jun 26, 2024 / EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation
Jun 12, 2024 / OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Mar 22, 2024 / InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
Mar 11, 2024 / VideoMamba: State Space Model for Efficient Video Understanding
Dec 6, 2022 / InternVideo: General Video Foundation Models via Generative and Discriminative Learning