Date / Name
Nov 14, 2019 / HUSE: Hierarchical Universal Semantic Embeddings
Jun 15, 2020 / Multi-Image Summarization: Textual Summary from a Set of Cohesive Images
Jul 1, 2020 / Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Aug 23, 2020 / Leveraging Organizational Resources to Adapt Models to New Data Modalities
Mar 30, 2021 / Diagnosing Vision-and-Language Navigation: What Really Matters
Dec 19, 2022 / MetaCLUE: Towards Comprehensive Visual Metaphors Research
Oct 7, 2020 / Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations
Oct 19, 2022 / CPL: Counterfactual Prompt Learning for Vision and Language Models
May 28, 2023 / KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Dec 19, 2023 / Gemini: A Family of Highly Capable Multimodal Models
Dec 9, 2022 / Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
May 18, 2023 / Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners