"au:"Yali Wang"" — arXiv2 Search

Showing 1–7 of 7 results

Feb 15, 2026 UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model
Jun 26, 2024 EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation
Jun 12, 2024 OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Mar 22, 2024 InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
Mar 11, 2024 VideoMamba: State Space Model for Efficient Video Understanding
Dec 6, 2022 InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Nov 24, 2021 MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning