Date          Name
Oct 15, 2020  Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Mar 23, 2021  PanGEA: The Panoramic Graph Environment Annotation Toolkit
Feb 15, 2018  Image Transformer
Mar 17, 2025  Levels of Analysis for Large Language Models
May 14, 2025  An evolutionary perspective on modes of learning in Transformers
May 29, 2019  Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
Oct 9, 2021   Vector-quantized Image Modeling with Improved VQGAN
Aug 9, 2019   Transferable Representation Learning in Vision-and-Language Navigation
Dec 27, 2023  Prompt Expansion for Adaptive Text-to-Image Generation
Jan 26, 2021  On the Evaluation of Vision-and-Language Navigation Instructions
Jun 22, 2022  Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Apr 30, 2024  DOCCI: Descriptions of Connected and Contrasting Images
Oct 31, 2024  Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Oct 6, 2022   A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
May 29, 2023  Gaussian Process Probes (GPP) for Uncertainty-Aware Probing
Jul 11, 2019  General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping
May 19, 2018  Capturing human category representations by sampling in deep feature spaces