Showing 301–320 of 2,609 results
Date | Name
Mar 6, 2026 | StruVis: Enhancing Reasoning-based Text-to-Image Generation via Thinking with Structured Vision
Mar 5, 2026 | FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation
Mar 3, 2026 | How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference
Mar 3, 2026 | Conditioned Activation Transport for T2I Safety Steering
Mar 3, 2026 | TC-Padé: Trajectory-Consistent Padé Approximation for Diffusion Acceleration
Mar 3, 2026 | ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization
Mar 3, 2026 | Evaluating Cross-Modal Reasoning Ability and Problem Characteristics with Multimodal Item Response Theory
Mar 1, 2026 | ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging
Mar 1, 2026 | Unified Vision-Language Modeling via Concept Space Alignment
Feb 28, 2026 | Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation
Feb 27, 2026 | Altitude-Adaptive Vision-Only Geo-Localization for UAVs in GPS-Denied Environments
Feb 27, 2026 | Diffusion Probe: Generated Image Result Prediction Using CNN Probes
Feb 27, 2026 | Evaluating Visual Prompts with Eye-Tracking Data for MLLM-Based Human Activity Recognition
Feb 27, 2026 | Annotation-Free Visual Reasoning for High-Resolution Large Multimodal Models via Reinforcement Learning
Feb 26, 2026 | Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos
Feb 26, 2026 | MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction
Feb 26, 2026 | SkillNet: Create, Evaluate, and Connect AI Skills
Feb 26, 2026 | Guidance Matters: Rethinking the Evaluation Pitfall for Text-to-Image Generation
Feb 25, 2026 | When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters
Feb 24, 2026 | RelA-Diffusion: Relativistic Adversarial Diffusion for Multi-Tracer PET Synthesis from Multi-Sequence MRI