Date          Name
Jun 7, 2022   A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector
Jun 8, 2021   Salvage of Supervision in Weakly Supervised Object Detection
May 10, 2025  StableMotion: Repurposing Diffusion-Based Image Priors for Motion Estimation
Jul 25, 2024  Harnessing Temporal Causality for Advanced Temporal Action Detection
Mar 9, 2025   TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos
Apr 10, 2025  Kimi-VL Technical Report
May 29, 2025  VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Jul 5, 2023   NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023
Jan 23, 2026  Affinity Contrastive Learning for Skeleton-based Human Activity Understanding
Jan 27, 2026  Towards Pixel-Level VLM Perception via Simple Points Prediction
Feb 2, 2026   Kimi K2.5: Visual Agentic Intelligence
Jul 28, 2025  Kimi K2: Open Agentic Intelligence
Jan 28, 2026  WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models
Mar 16, 2026  Attention Residuals