arXiv2
Search results for au:"Ranjay Krishna" (arXiv2 Search)
Showing 1–7 of 7 results
Feb 26, 2026
Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos
Dec 11, 2025
Mull-Tokens: Modality-Agnostic Latent Thinking
Dec 10, 2024
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models
Dec 5, 2024
NVILA: Efficient Frontier Visual Language Models
Sep 26, 2024
The Hard Positive Truth about Vision-Language Compositionality
Nov 1, 2023
Improving Interpersonal Communication by Simulating Audiences with Language Models
Nov 20, 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs