"au:"Joseph Marvin Imperial"" — arXiv2 SearchShowing 1–7 of 7 results
Date / Name
Oct 28, 2025 / Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures
Aug 5, 2025 / FilBench: Can LLMs Understand and Generate Filipino?
Apr 9, 2025 / Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation
Mar 10, 2025 / Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
Jan 24, 2025 / Humanity's Last Exam
Nov 29, 2024 / INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
Jun 14, 2024 / SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages