"au:"Rongrong Ji"" — arXiv2 Search
Showing 1–9 of 9 results
Date          Name
Apr 23, 2026  Prototype-Based Test-Time Adaptation of Vision-Language Models
Feb 11, 2026  Flow caching for autoregressive video generation
Nov 4, 2025   LTD-Bench: Evaluating Large Language Models by Letting Them Draw
Oct 17, 2025  FlexiReID: Adaptive Mixture of Expert for Multi-Modal Person Re-Identification
May 30, 2025  Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
Feb 7, 2025   Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
Sep 12, 2020  Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
Jan 15, 2020  Filter Grafting for Deep Neural Networks
Nov 17, 2017  Action-Attending Graphic Neural Network