Date          Name
Dec 31, 2025  TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
Dec 29, 2025  Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
Dec 27, 2025  Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework
Dec 23, 2025  CHAMMI-75: Pre-training multi-channel models with heterogeneous microscopy images
Dec 23, 2025  Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models
Dec 22, 2025  Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs
Dec 22, 2025  BabyFlow: 3D modeling of realistic and expressive infant faces
Dec 21, 2025  A Study of Finetuning Video Transformers for Multi-view Geometry Tasks
Dec 18, 2025  EasyV2V: A High-quality Instruction-based Video Editing Framework
Dec 18, 2025  Smile on the Face, Sadness in the Eyes: Bridging the Emotion Gap with a Multimodal Dataset of Eye and Facial Behaviors
Dec 16, 2025  Artificial Intelligence for the Assessment of Peritoneal Carcinosis during Diagnostic Laparoscopy for Advanced Ovarian Cancer
Dec 15, 2025  Adapting MLLMs for Nuanced Video Retrieval
Dec 15, 2025  Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
Dec 15, 2025  Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
Dec 15, 2025  Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance
Dec 15, 2025  Comprehensive Deployment-Oriented Assessment for Cross-Environment Generalization in Deep Learning-Based mmWave Radar Sensing
Dec 12, 2025  MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator
Dec 12, 2025  The N-Body Problem: Parallel Execution from Single-Person Egocentric Video
Dec 11, 2025  WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
Dec 11, 2025  Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization