Showing 1–20 of 22 results
Date / Name
Dec 5, 2023 / SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Oct 10, 2025 / Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Feb 11, 2026 / MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation
Dec 17, 2019 / Learning Generalizable Visual Representations via Interactive Gameplay
Apr 22, 2021 / ManipulaTHOR: A Framework for Visual Object Manipulation
Dec 14, 2017 / AI2-THOR: An Interactive 3D Environment for Visual AI
Dec 3, 2019 / ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Oct 6, 2025 / Visual Representations inside the Language Model
Mar 18, 2026 / Unified Spatio-Temporal Token Scoring for Efficient Video VLMs
Apr 9, 2026 / WildDet3D: Scaling Promptable 3D Detection in the Wild
Jun 14, 2022 / ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Mar 17, 2026 / MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation
Apr 14, 2020 / RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
May 19, 2025 / GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation
Apr 9, 2026 / MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
Oct 13, 2022 / Retrospectives on the Embodied AI Workshop
Dec 18, 2024 / The One RING: a Robotic Indoor Navigation Generalist
Aug 11, 2025 / MolmoAct: Action Reasoning Models that can Reason in Space
Dec 14, 2023 / Holodeck: Language Guided Generation of 3D Embodied AI Environments
Feb 26, 2026 / Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos