Showing 1–20 of 28 results
Date          Name
Oct 2, 2025   DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis
Aug 9, 2019   Relation-Aware Pyramid Network (RapNet) for temporal action proposal
Oct 12, 2021  Relation-aware Video Reading Comprehension for Temporal Language Grounding
Mar 9, 2020   Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network
Nov 24, 2025  Rethinking Intermediate Representation for VLM-based Robot Manipulation
Nov 19, 2024  Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning
Jun 1, 2024   Advancing Supervised Local Learning Beyond Classification with Long-term Feature Bank
Apr 15, 2025  LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation
Feb 12, 2026  PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
Feb 23, 2026  PosterReward: Unlocking Accurate Evaluation for High-Quality Graphic Design Generation
Jul 22, 2025  MAN++: Scaling Momentum Auxiliary Network for Supervised Local Learning in Vision Tasks
Nov 23, 2025  Beyond Words and Pixels: A Benchmark for Implicit World Knowledge Reasoning in Generative Models
Dec 24, 2019  Focusing and Diffusion: Bidirectional Attentive Graph Convolutional Networks for Skeleton-based Action Recognition
Apr 6, 2023   Boundary-Denoising for Video Activity Localization
Jun 2, 2025   Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
Mar 13, 2025  SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems
Apr 30, 2026  LaST-R1: Reinforcing Robotic Manipulation via Adaptive Physical Latent Reasoning
Jul 26, 2024  From 2D to 3D: AISG-SLA Visual Localization Challenge
Aug 6, 2025   DOMR: Establishing Cross-View Segmentation via Dense Object Matching
Jan 14, 2025  LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding