Date          Name
Mar 19, 2026  Improving Joint Audio-Video Generation with Cross-Modal Context Learning
Aug 8, 2022   Rethinking Robust Representation Learning Under Fine-grained Noisy Faces
Jun 17, 2024  Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Mar 7, 2024   MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder
Apr 19, 2024  MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Nov 29, 2024  Pretrained Reversible Generation as Unsupervised Visual Representation Learning
Mar 18, 2026  AR-CoPO: Align Autoregressive Video Generation with Contrastive Policy Optimization
Oct 25, 2023  Towards Large-scale Masked Face Recognition
Apr 17, 2022  Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection
Mar 28, 2025  High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning
Dec 12, 2024  EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
Dec 15, 2024  VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
Apr 15, 2025  ADT: Tuning Diffusion Models with Adversarial Supervision
Apr 3, 2026   Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation
Nov 21, 2025  Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models
Aug 28, 2023  Institutional mapping and causal analysis of avalanche vulnerable areas based on multi-source data