"au:"Yinmin Zhang"" — arXiv2 Search

Showing 1–19 of 19 results

Dec 12, 2023  A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Oct 18, 2023  MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based Collaborative Learning
Jul 29, 2021  Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection
Nov 29, 2022  ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
Oct 9, 2023  Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection
Mar 31, 2025  Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Jan 9, 2026  PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
Mar 30, 2021  Delving into Localization Errors for Monocular 3D Object Detection
Dec 18, 2023  Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
Jul 7, 2025  Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Jan 14, 2026  STEP3-VL-10B Technical Report
Feb 12, 2026  PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering
Jul 25, 2025  Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding
Jul 24, 2023  Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning
Aug 15, 2022  An Empirical Study of Pseudo-Labeling for Image-based 3D Object Detection
Dec 26, 2024  Multi-matrix Factorization Attention
Feb 6, 2026  R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging
Nov 28, 2025  Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction
Feb 11, 2026  Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters