Date | Name
Jan 8, 2026 | VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Feb 11, 2025 | VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation
Feb 20, 2022 | Clustering by the Probability Distributions from Extreme Value Theory
Jul 13, 2024 | ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context
Mar 23, 2021 | Incrementally Zero-Shot Detection by an Extreme Value Analyzer
Dec 31, 2020 | Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Feb 24, 2024 | Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT
Mar 4, 2018 | Could Interaction with Social Robots Facilitate Joint Attention of Children with Autism Spectrum Disorder?
Mar 30, 2025 | ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning
Jun 4, 2021 | NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection
Jul 19, 2022 | Vision Transformers: From Semantic Segmentation to Dense Prediction
Jul 9, 2025 | A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding
Mar 16, 2026 | Mind-of-Director: Multi-modal Agent-Driven Film Previsualization via Collaborative Decision-Making
Jul 2, 2025 | TriVLA: A Triple-System-Based Unified Vision-Language-Action Model with Episodic World Modeling for General Robot Control