arXiv2 Search: au:"Yunhang Shen"
Showing 1–5 of 5 results
Jan 27, 2026
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision
Oct 17, 2025
FlexiReID: Adaptive Mixture of Expert for Multi-Modal Person Re-Identification
Feb 7, 2025
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
Nov 1, 2024
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Oct 24, 2023
Woodpecker: Hallucination Correction for Multimodal Large Language Models