Date | Title
Jan 8, 2024 | Language-Conditioned Robotic Manipulation with Fast and Slow Thinking
Dec 4, 2024 | Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning
Sep 30, 2025 | dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought
Jun 28, 2024 | MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
Jan 5, 2024 | Object-Centric Instruction Augmentation for Robotic Manipulation
Sep 6, 2023 | Enhancing Asynchronous Time Series Forecasting with Contrastive Relational Inference
Jan 4, 2025 | Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning
Jan 4, 2024 | LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model
Nov 29, 2023 | SpeechAct: Towards Generating Whole-body Motion from Speech
Sep 19, 2024 | TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Mar 10, 2024 | Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Feb 20, 2025 | ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
Sep 22, 2024 | Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation
Dec 29, 2024 | CoA-VLA: Improving Vision-Language-Action Models via Visual-Textual Chain-of-Affordance
Feb 26, 2025 | ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration