arXiv2
au:"Yuchen Shi" (arXiv2 Search)
Showing 1–4 of 4 results
Apr 23, 2026
CSC: Turning the Adversary's Poison against Itself
Nov 4, 2025
LTD-Bench: Evaluating Large Language Models by Letting Them Draw
Oct 21, 2025
CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent
Jun 2, 2025
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models