arXiv2
arXiv2 Search: au:"Jihan Yao"
Showing 1–8 of 8 results
Oct 14, 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Nov 25, 2025
Reading Between the Lines: Abstaining from VLM-Generated OCR Errors via Latent Representation Probes
May 23, 2025
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation
Feb 9, 2024
POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition
Oct 9, 2025
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
Feb 14, 2024
LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop
Jul 25, 2024
Know Your Limits: A Survey of Abstention in Large Language Models
Jan 29, 2026
MoCo: A One-Stop Shop for Model Collaboration Research