Date          Name
Oct 19, 2024  DPVS-Shapley: Faster and Universal Contribution Evaluation Component in Federated Learning
Dec 18, 2024  LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer
Apr 11, 2022  Semantic Segmentation for Point Cloud Scenes via Dilated Graph Feature Aggregation and Pyramid Decoders
Aug 13, 2022  Bidirectional Feature Globalization for Few-shot Semantic Segmentation of 3D Point Cloud Scenes
Dec 2, 2025   GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding
Aug 26, 2025  EMind: A Foundation Model for Multi-task Electromagnetic Signals Understanding
Mar 9, 2026   MERLIN: Building Low-SNR Robust Multimodal LLMs for Electromagnetic Signals
Mar 16, 2025  Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition
Mar 31, 2025  XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?
Sep 16, 2025  MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
Nov 26, 2025  LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
Mar 27, 2025  Video-R1: Reinforcing Video Reasoning in MLLMs
Mar 13, 2026  Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation
May 27, 2025  GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
Mar 17, 2025  KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
Oct 21, 2025  ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder
Dec 29, 2025  MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios?
May 19, 2022  Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
Oct 6, 2021   Long-tailed Distribution Adaptation
Mar 18, 2024  LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images