Showing 1–20 of 25 results
/ Date/ Name
Sep 20, 2018Recent progress in semantic image segmentationAug 4, 2023Improving Scene Graph Generation with Superpixel-Based Interaction LearningAug 8, 20233D-VisTA: Pre-trained Transformer for 3D Vision and Text AlignmentMay 19, 2024Unifying 3D Vision-Language Understanding via Promptable QueriesAug 26, 2024Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long VideosApr 1, 2023TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking StylesApr 5, 2024Context-Aware Aerial Object Detection: Leveraging Inter-Object and Background RelationshipsJul 31, 2018SegStereo: Exploiting Semantic Information for Disparity EstimationMar 8, 2022DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature ExtractionNov 27, 2018Fast Object Detection in Compressed VideoAug 29, 2024LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language ModelsMay 27, 2025Exploring Timeline Control for Facial Motion GenerationFeb 22, 2021Phase Space Reconstruction Network for Lane Intrusion Action RecognitionMay 15, 2023Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene GraphsDec 15, 2023DreamTalk: When Emotional Talking Head Generation Meets Diffusion Probabilistic ModelsFeb 5, 2026FUTURE-VLA: Forecasting Unified Trajectories Under Real-time ExecutionJan 3, 2023StyleTalk: One-shot Talking Head Generation with Controllable Speaking StylesFeb 19, 2021A Deep Graph Wavelet Convolutional Neural Network for Semi-supervised Node ClassificationSep 14, 2024StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking HeadsJul 5, 2025Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation