Date / Name
Mar 26, 2024 / DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation
Jun 4, 2024 / M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising
Mar 1, 2024 / Point Cloud Mamba: Point Cloud Learning via State Space Model
Jun 17, 2024 / CustAny: Customizing Anything from A Single Example
Aug 1, 2023 / PVG: Progressive Vision Graph for Vision Recognition
Mar 10, 2023 / Iterative Few-shot Semantic Segmentation from Image Label Text
Mar 14, 2023 / Calibrated Teacher for Sparsely Annotated Object Detection
May 24, 2024 / PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Aug 6, 2024 / MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation
Jul 7, 2025 / Semantic Frame Interpolation
Jun 9, 2025 / PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Oct 21, 2025 / UltraGen: High-Resolution Video Generation with Hierarchical Attention
Nov 25, 2025 / Boosting Reasoning in Large Multimodal Models via Activation Replay
Dec 25, 2025 / UltraLBM-UNet: Ultralight Bidirectional Mamba-based Model for Skin Lesion Segmentation
Dec 4, 2024 / DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Jul 2, 2025 / Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning
May 24, 2025 / So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection
Jun 16, 2025 / AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
Jan 9, 2026 / Towards Generalized Multi-Image Editing for Unified Multimodal Models
Jan 31, 2026 / Dual Latent Memory for Visual Multi-agent System