Date          Name
Mar 4, 2026   Phi-4-reasoning-vision-15B Technical Report
Sep 29, 2023  HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World
Jun 30, 2015  Lens Factory: Automatic Lens Generation Using Off-the-shelf Components
Jun 20, 2017  Highly curved image sensors: a practical approach for improved optical performance
Sep 12, 2023  Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation
May 29, 2023  Controllable Text-to-Image Generation with GPT-4
Mar 31, 2025  Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Apr 30, 2025  Phi-4-reasoning Technical Report
Jan 21, 2020  Depth Completion Using a View-constrained Deep Prior
Jul 11, 2022  Scaling Novel Object Detection with Weakly Supervised Detection Transformers
Jun 20, 2022  DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Jun 21, 2024  Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models
Jan 12, 2022  Robust Contrastive Learning against Noisy Views
Apr 23, 2022  Visual Attention Emerges from Recurrent Sparse Reconstruction
Apr 11, 2019  Synthetic Examples Improve Generalization for Rare Classes
Mar 31, 2017  Semantic-driven Generation of Hyperlapse from 360° Video
Jul 22, 2022  Neural-Sim: Learning to Generate Training Data with NeRF
Oct 17, 2024  Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models
Oct 22, 2019  A deep active learning system for species identification and counting in camera trap images
Jun 27, 2018  Learn-to-Score: Efficient 3D Scene Exploration by Predicting View Utility