Date | Title
Jan 8, 2024 | Language-Conditioned Robotic Manipulation with Fast and Slow Thinking
Dec 4, 2024 | Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning
Sep 30, 2025 | dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought
Jun 28, 2024 | MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
Jan 5, 2024 | Object-Centric Instruction Augmentation for Robotic Manipulation
Sep 6, 2023 | Enhancing Asynchronous Time Series Forecasting with Contrastive Relational Inference
Jan 4, 2025 | Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning
Jan 4, 2024 | LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model
Nov 29, 2023 | SpeechAct: Towards Generating Whole-body Motion from Speech
Sep 19, 2024 | TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Mar 10, 2024 | Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Feb 20, 2025 | ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
Sep 22, 2024 | Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation
Dec 29, 2024 | CoA-VLA: Improving Vision-Language-Action Models via Visual-Textual Chain-of-Affordance
Feb 26, 2025 | ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration