Showing 1–12 of 12 results
Date / Name
Dec 18, 2025 / Smile on the Face, Sadness in the Eyes: Bridging the Emotion Gap with a Multimodal Dataset of Eye and Facial Behaviors
Aug 25, 2025 / InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Jul 19, 2025 / Docopilot: Improving Multimodal Models for Document-Level Understanding
Jul 7, 2025 / Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
May 30, 2025 / Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
Jan 17, 2025 / FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis
Nov 8, 2024 / Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors
Jun 12, 2024 / OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Apr 26, 2024 / Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting
May 18, 2023 / VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
May 4, 2023 / Noise-Resistant Multimodal Transformer for Emotion Recognition
Jun 23, 2022 / CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose