"au:"Winson Han"" — arXiv2 Search

Showing 1–20 of 22 results

Dec 5, 2023 - SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Oct 10, 2025 - Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Feb 11, 2026 - MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation
Dec 17, 2019 - Learning Generalizable Visual Representations via Interactive Gameplay
Apr 22, 2021 - ManipulaTHOR: A Framework for Visual Object Manipulation
Dec 14, 2017 - AI2-THOR: An Interactive 3D Environment for Visual AI
Dec 3, 2019 - ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Oct 6, 2025 - Visual Representations inside the Language Model
Mar 18, 2026 - Unified Spatio-Temporal Token Scoring for Efficient Video VLMs
Apr 9, 2026 - WildDet3D: Scaling Promptable 3D Detection in the Wild
Jun 14, 2022 - ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Mar 17, 2026 - MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation
Apr 14, 2020 - RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
May 19, 2025 - GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation
Apr 9, 2026 - MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
Oct 13, 2022 - Retrospectives on the Embodied AI Workshop
Dec 18, 2024 - The One RING: a Robotic Indoor Navigation Generalist
Aug 11, 2025 - MolmoAct: Action Reasoning Models that can Reason in Space
Dec 14, 2023 - Holodeck: Language Guided Generation of 3D Embodied AI Environments
Feb 26, 2026 - Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos