"au:"Jindong Gu"" — arXiv2 Search
Showing 1–7 of 7 results
Date / Name
Jul 7, 2025 / Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Jun 23, 2025 / AViLA: Asynchronous Vision-Language Agent for Streaming Multimodal Data Interaction
Apr 2, 2025 / On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows
Sep 28, 2024 / Visual Question Decomposition on Multimodal Large Language Models
Jul 24, 2023 / A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
Apr 17, 2023 / Towards Robust Prompts on Vision-Language Models
Jul 25, 2022 / SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness